METHOD OF DNA SHUFFLING WITH POLYNUCLEOTIDES PRODUCED
BY BLOCKING OR INTERRUPTING A SYNTHESIS OR AMPLIFICATION
PROCESS

Field of the Invention 

This invention relates generally to the field of molecular biology and more
specifically to the preparation of polynucleotides encoding polypeptides by generating
polynucleotides via a procedure involving blocking or interrupting a synthesis or
amplification process with an adduct, agent, molecule or other inhibitor, assembling the
polynucleotides to form at least one mutant polynucleotide and screening the mutant
polynucleotides for the production of a mutant polypeptide(s) having a particular useful
property.

Description of the Related Art

An exceedingly large number of possibilities exist for purposeful and random
combinations of amino acids within a protein to produce useful mutant proteins and their
corresponding biological molecules encoding for the mutant proteins, i.e., DNA, RNA,
etc. Accordingly, there is a need to produce and screen a wide variety of such mutant
proteins for a useful utility, particularly widely varying random proteins.

The following general discussion of protein and polynucleotide fields may be
helpful in further understanding the background for the present invention.

The complexity of an active sequence of a biological macromolecule, e.g.,
proteins, DNA etc., has been called its information content ("IC"; 5-9), which has been
defined as the resistance of the active protein to amino acid sequence variation
(calculated from the minimum number of invariable amino acids (bits)) required to
describe a family of related sequences with the same function. Proteins that are more
sensitive to random mutagenesis have a high information content.

Molecular biology developments such as molecular libraries have allowed the
identification of quite a large number of variable bases, and even provide ways to select
functional sequences from random libraries. In such libraries, most residues can be
varied (although typically not all at the same time) depending on compensating changes
in the context. Thus, while a 100 amino acid protein can contain only 2,000 different
mutations, 20^100 combinations of mutations are possible.

WO 98/01581
PCT/US97/12239
Information density is the information content per unit length of a sequence.
Active sites of enzymes tend to have a high information density. By contrast, flexible
linkers of information in enzymes have a low information density.
Current methods in widespread use for creating mutant proteins in a library
format are error-prone polymerase chain reactions and cassette mutagenesis, in which the
specific region to be optimized is replaced with a synthetically mutagenized
oligonucleotide. In both cases, a cloud of mutant sites is generated around certain sites
in the original sequence.

Error-prone PCR uses low-fidelity polymerization conditions to introduce a
low level of point mutations randomly over a long sequence. In a mixture of fragments
of unknown sequence, error-prone PCR can be used to mutagenize the mixture. The
published error-prone PCR protocols suffer from a low processivity of the polymerase.
Therefore, the protocol is unable to result in the random mutagenesis of an average-sized
gene. This inability limits the practical application of error-prone PCR. Some computer
simulations have suggested that point mutagenesis alone may often be too gradual to
allow the large-scale block changes that are required for continued and dramatic
sequence evolution. Further, the published error-prone PCR protocols do not allow for
amplification of DNA fragments greater than 0.5 to 1.0 kb, limiting their practical
application. In addition, repeated cycles of error-prone PCR can lead to an accumulation
of neutral mutations with undesired results, such as affecting a protein's
immunogenicity but not its binding affinity.

In oligonucleotide-directed mutagenesis, a short sequence is replaced with a
synthetically mutagenized oligonucleotide. This approach does not generate
combinations of distant mutations and is thus not combinatorial. The limited library size
relative to the vast sequence length means that many rounds of selection are unavoidable
for protein optimization. Mutagenesis with synthetic oligonucleotides requires sequencing
of individual clones after each selection round followed by grouping them into families,
arbitrarily choosing a single family, and reducing it to a consensus motif. Such motif is
resynthesized and reinserted into a single gene followed by additional selection. This
step process constitutes a statistical bottleneck, is labor intensive, and is not practical
for many rounds of mutagenesis.

Error-prone PCR and oligonucleotide-directed mutagenesis are thus useful
for single cycles of sequence fine tuning, but rapidly become too limiting when they are
applied for multiple cycles.

Another serious limitation of error-prone PCR is that the rate of
down-mutations grows with the information content of the sequence. As the information
content, library size, and mutagenesis rate increase, the balance of down-mutations to
up-mutations will statistically prevent the selection of further improvements (statistical
ceiling).

In cassette mutagenesis, a sequence block of a single template is typically
replaced by a (partially) randomized sequence. Therefore, the maximum information
content that can be obtained is statistically limited by the number of random sequences
(i.e., library size). This eliminates other sequence families which are not currently best,
but which may have greater long term potential.

Also, mutagenesis with synthetic oligonucleotides requires sequencing of
individual clones after each selection round. Thus, such an approach is tedious and
impractical for many rounds of mutagenesis.

Thus, error-prone PCR and cassette mutagenesis are best suited, and have
been widely used, for fine-tuning areas of comparatively low information content. One
apparent exception is the selection of an RNA ligase ribozyme from a random library
using many rounds of amplification by error-prone PCR and selection.

It is becoming increasingly clear that the tools for the design of recombinant
linear biological sequences such as protein, RNA and DNA are not as powerful as the
tools nature has developed. Finding better and better mutants depends on searching more
and more sequences within larger and larger libraries, and requiring increased numbers
of cycles of mutagenic amplification and selection. However, as discussed above, the
existing mutagenesis methods that are in widespread use have distinct limitations when
used for repeated cycles.

In nature, the evolution of most organisms occurs by natural selection and
sexual reproduction. Sexual reproduction ensures mixing and combining of the genes
in the offspring of the selected individuals. During meiosis, homologous chromosomes
from the parents line up with one another and cross over part way along their length, thus
randomly swapping genetic material. Such swapping or shuffling of the DNA allows
organisms to evolve more rapidly.

In sexual recombination, because the inserted sequences were of proven
utility in a homologous environment, the inserted sequences are likely to still have
substantial information content once they are inserted into the new sequence.

Marton et al. describe the use of PCR in vitro to monitor recombination in
a plasmid having directly repeated sequences. Marton et al. disclose that recombination
will occur during PCR as a result of breaking or nicking of the DNA. This will give rise
to recombinant molecules. Meyerhans et al. also disclose the existence of DNA
recombination during in vitro PCR.

The term Applied Molecular Evolution ("AME") means the application of an
evolutionary design algorithm to a specific, useful goal. While many different library
formats for AME have been reported for polynucleotides, peptides and proteins (phage,
lacI and polysomes), none of these formats have provided for recombination by random
cross-overs to deliberately create a combinatorial library.

Theoretically there are 2,000 different single mutants of a 100 amino acid
protein. However, a protein of 100 amino acids has 20^100 possible combinations of
mutations, a number which is too large to exhaustively explore by conventional methods.
It would be advantageous to develop a system which would allow generation and
screening of all of these possible combination mutations.
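
The scale of the gap between single mutants and combinations can be checked with a few
lines of arithmetic. The sketch below follows the counting convention of the text (20
residue choices at each of 100 positions); the variable names are illustrative only.

```python
# Counting convention taken from the text: 20 residue choices at each
# of 100 positions. Names are illustrative, not from the specification.
PROTEIN_LENGTH = 100
ALPHABET_SIZE = 20

# One position changed at a time: 100 * 20 = 2,000 in the text's counting.
single_mutants = PROTEIN_LENGTH * ALPHABET_SIZE

# Every position free to vary at once: 20^100 sequences.
all_combinations = ALPHABET_SIZE ** PROTEIN_LENGTH

print(single_mutants)                  # 2000
print(len(str(all_combinations)))      # number of decimal digits: 131
```

A library of any practical size therefore samples only a vanishing fraction of the
combinatorial space, which is the motivation for recombining already-selected variants.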

Some workers in the art have utilized an in vivo site specific recombination
system to combine light chain antibody genes with heavy chain antibody genes for
expression in a phage system. However, their system relies on specific sites of
recombination and is limited accordingly. Simultaneous mutagenesis of antibody CDR
regions in single chain antibodies (scFv) by overlapping extension and PCR has been
reported.

Others have described a method for generating a large population of multiple
mutants using random in vivo recombination. However, their method requires the
recombination of two different libraries of plasmids, each library having a different
selectable marker. Thus, their method is limited to a finite number of recombinations
equal to the number of selectable markers existing, and produces a concomitant linear
increase in the number of marker genes linked to the selected sequence(s).

In vivo recombination between two homologous but truncated insect-toxin
genes on a plasmid has been reported as also being capable of producing a hybrid gene.
The in vivo recombination of substantially mismatched DNA sequences in a host cell
having defective mismatch repair enzymes, resulting in hybrid molecule formation, has
been reported.

As discussed above, prior methods for producing random proteins from
randomized genetic material have met with limited success. Perhaps the best method,
thus far, for producing and screening a wide variety of random proteins is a method
which utilizes enzymes to cleave (chop) a long nucleotide chain into shorter pieces,
followed by procedures to separate the chopping agents from the genetic material and
procedures to amplify (multiply the copies of) the remaining genetic material in a manner
that allows the annealing of the polynucleotides back into chains (either purposefully or
randomly put them back together).

A drawback to this method is the expense and inconvenience of utilizing
biological enzymes to chop up the genetic material, which are then separated from the
genetic material prior to the amplification step. Further, depending upon the particular
genetic material, different concentrations of the chopping agents are required to produce
the desired fragments. Moreover, the control mechanisms required for biological
enzymes are not trivial.

Accordingly, there is a need in the art for producing an improved method of
obtaining truly random pieces of genetic material for reassembly to produce random
proteins which may be screened for a particular use. The need to produce large libraries
of widely varying mutant nucleic acid sequences is an important goal. Hence, it would
be advantageous to develop such a method for the production of mutant proteins which
allows for the development of large libraries of mutant nucleic acid sequences which are
easily searched. There is a need to develop such a method which allows for the
production of large libraries of mutant DNA, RNA or proteins and the selection of
particular mutants for a desired goal.

The invention described herein is directed to the use of repeated cycles of
mutagenesis, recombination and selection, which allow for the directed molecular
evolution of highly complex linear sequences, such as DNA, RNA or proteins through
recombination. It uses repeated cycles of random point mutagenesis, nucleic acid
shuffling and selection which allow for the directed molecular evolution in vitro of
highly complex linear sequences, such as proteins, through random recombination.

Summary of the Invention

The present invention is directed to a method for generating a selected mutant
polynucleotide sequence (or a population of selected polynucleotide sequences) typically
in the form of amplified and/or cloned polynucleotides, whereby the selected
polynucleotide sequence(s) possess at least one desired phenotypic characteristic (e.g.,
encodes a polypeptide, promotes transcription of linked polynucleotides, binds a protein,
and the like) which can be selected for. One method for identifying mutant polypeptides
that possess a desired structure or functional property, such as binding to a predetermined
biological macromolecule (e.g., a receptor), involves the screening of a large library of
polypeptides for individual library members which possess the desired structure or
functional property conferred by the amino acid sequence of the polypeptide.

In one embodiment, the present invention provides a method for generating
libraries of displayed polypeptides or displayed antibodies suitable for affinity interaction
screening or phenotypic screening. The method comprises (1) obtaining a first plurality
of selected library members comprising a displayed polypeptide or displayed antibody
and an associated polynucleotide encoding said displayed polypeptide or displayed
antibody, and obtaining said associated polynucleotides or copies thereof wherein said
associated polynucleotides comprise a region of substantially identical sequences,
optionally introducing mutations into said polynucleotides or copies, (2) pooling the
polynucleotides or copies, (3) producing smaller or shorter polynucleotides by
interrupting a random or particularized priming and synthesis process or an amplification
process, and (4) performing amplification, preferably PCR amplification, and optionally
mutagenesis to homologously recombine the newly synthesized polynucleotides.

It is a particularly preferred object of the invention to provide a process for
producing mutant polynucleotides which express a useful mutant polypeptide by a series
of steps comprising:

(a) producing polynucleotides by interrupting a polynucleotide
amplification or synthesis process with a means for blocking or interrupting the
amplification or synthesis process and thus providing a plurality of smaller or shorter
polynucleotides due to the replication of the polynucleotide being in various stages of
completion;

(b) adding to the resultant population of single- or double-stranded
polynucleotides one or more single- or double-stranded oligonucleotides, wherein said
added oligonucleotides comprise an area of identity in an area of heterology to one or
more of the single- or double-stranded polynucleotides of the population;

(c) denaturing the resulting single- or double-stranded oligonucleotides
to produce a mixture of single-stranded polynucleotides, optionally separating the shorter
or smaller polynucleotides into pools of polynucleotides having various lengths and
further optionally subjecting said polynucleotides to a PCR procedure to amplify one or
more oligonucleotides comprised by at least one of said polynucleotide pools;

(d) incubating a plurality of said polynucleotides or at least one pool of
said polynucleotides with a polymerase under conditions which result in annealing of
said single-stranded polynucleotides at regions of identity between the single-stranded
polynucleotides and thus forming of a mutagenized double-stranded polynucleotide
chain;

(e) optionally repeating steps (c) and (d);

(f) expressing at least one mutant polypeptide from said polynucleotide
chain, or chains; and

(g) screening said at least one mutant polypeptide for a useful activity.
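
The logic of steps (a), (d), (e) and (g) can be illustrated with a toy in-silico model.
This is only a sketch under strong simplifying assumptions, not the laboratory procedure:
the parents are taken to be pre-aligned strings of equal length, an interrupted extension
is modeled as a per-base chance of halting, and re-annealing at a region of identity is
modeled as resuming on a randomly chosen parent at the same aligned position. The parent
sequences, the target and the `score` function are hypothetical.

```python
import random

def shuffle_parents(parents, block_prob, rng):
    """Build one chimera from equal-length, aligned parent sequences.

    Extension proceeds along one parent; with probability `block_prob` per
    base it is interrupted, and the incomplete strand re-anneals to a randomly
    chosen parent at the same aligned position before extension resumes.
    """
    length = len(parents[0])
    current = rng.randrange(len(parents))
    chimera = []
    for pos in range(length):
        chimera.append(parents[current][pos])
        if rng.random() < block_prob:
            current = rng.randrange(len(parents))  # template switch
    return "".join(chimera)

def make_library(parents, size, block_prob=0.1, seed=0):
    """Steps (a), (d), (e): generate `size` reassembled chimeras."""
    rng = random.Random(seed)
    return [shuffle_parents(parents, block_prob, rng) for _ in range(size)]

def screen(library, score, top_n):
    """Step (g): keep the top-scoring members under a user-supplied score."""
    return sorted(library, key=score, reverse=True)[:top_n]

# Hypothetical homologous parents (24 bases each) and a hypothetical target.
parents = ["ATGGCTAGCAAAGGTGAAGAACTG",
           "ATGGCAAGCAAGGGTGAGGAATTG"]
target = "ATGGCTAGCAAGGGTGAAGAATTG"

def score(seq):
    """Toy stand-in for an activity assay: positions matching `target`."""
    return sum(a == b for a, b in zip(seq, target))

library = make_library(parents, size=50, block_prob=0.2, seed=1)
hits = screen(library, score, top_n=5)
```

Every base of every chimera comes from one of the parents at the same position, so the
model recombines existing variation without introducing point mutations; step (b), the
addition of oligonucleotides, is not modeled.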

In a preferred aspect of the invention, the means for blocking or interrupting
the amplification or synthesis process is by utilization of UV light, DNA adducts, DNA
binding proteins. Preferably, the DNA adduct is a member selected from the group
consisting of:
UV light; (+)-CC-1065; (+)-CC-1065-(N3-Adenine); a N-acetylated or deacetylated
4'-fluoro-4-aminobiphenyl adduct capable of inhibiting DNA synthesis, or a N-acetylated or
deacetylated 4-aminobiphenyl adduct capable of inhibiting DNA synthesis; trivalent
chromium; a trivalent chromium salt; a polycyclic aromatic hydrocarbon ("PAH") DNA
adduct capable of inhibiting DNA replication, such as 7-bromomethyl-benz[a]anthracene
("BMA"); tris(2,3-dibromopropyl)phosphate ("Tris-BP"); 1,2-dibromo-3-chloropropane
("DBCP"); 2-bromoacrolein (2BA); benzo[a]pyrene-7,8-dihydrodiol-9-10-epoxide
("BPDE"); a platinum(II) halogen salt; N-hydroxy-2-amino-3-methylimidazo[4,5-f]-
quinoline ("N-hydroxy-IQ"); and N-hydroxy-2-amino-1-methyl-6-phenylimidazo[4,5-f]-
pyridine ("N-hydroxy-PhIP").

Especially preferred members from the grouping consist of UV light, (+)-CC-1065
and (+)-CC-1065-(N3-Adenine).

In one embodiment of the invention, the DNA adducts, or polynucleotides
comprising the DNA adducts, are removed from the polynucleotides or polynucleotide
pool, such as by a process including heating the solution comprising the DNA fragments
prior to further processing.



Detailed Description of the Invention

The present invention relates to an enhanced method of DNA "shuffling,"
which may be referred to as "Sexual PCR." In a preferred embodiment of the present
invention, amplified or cloned polynucleotides possessing a desired characteristic (for
example, encoding a polypeptide of interest, etc.) are selected (via screening of a library
of polynucleotides, for example) and pooled. The pooled polynucleotides (or at least one
polynucleotide) may be subjected to at least one of random primer extension
reactions or PCR amplification using random primers to multiply portions of the
polynucleotide or polynucleotides. At various stages along the completion of the PCR
amplification or synthesis process, the process may be blocked or interrupted. Hence,
a collection of incomplete copies of the polynucleotide or polynucleotides can be
generated by random primer extension reactions, amplification using random primers,
and/or by pausing or stopping the replication process.
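
How random priming combined with blocked extension yields a spread of incomplete
copies can be sketched in a few lines. The model below is an assumption-laden toy: each
copy starts at a uniformly random priming position, extension halts at each base with a
fixed probability `block_prob` (a stand-in for an adduct or UV lesion), and the copy is
written in the template's own letters rather than as its complement. The function name
and the example template are hypothetical.

```python
import random

def interrupted_copies(template, n_copies, block_prob, seed=0):
    """Return (start, fragment) pairs for truncated copies of `template`.

    Each copy begins at a random priming position and is extended one base
    at a time; after every base, extension halts with probability
    `block_prob`, or when the end of the template is reached.
    """
    rng = random.Random(seed)
    copies = []
    for _ in range(n_copies):
        start = rng.randrange(len(template))   # random priming site
        end = start
        while end < len(template):
            end += 1                            # extend by one base
            if rng.random() < block_prob:       # blocked: stop here
                break
        copies.append((start, template[start:end]))
    return copies

template = "ATGGCTAGCAAAGGTGAAGAACTG"  # hypothetical 24-base template
copies = interrupted_copies(template, n_copies=100, block_prob=0.2, seed=3)
lengths = sorted(len(fragment) for _, fragment in copies)
```

Raising `block_prob` shifts the pool toward shorter fragments, which loosely mirrors the
effect of a higher adduct load or UV dose in this simplified picture.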

These collections of shorter or smaller polynucleotides (pools) may be
isolated or collectively amplified further by PCR, which may be interrupted again. Such
"stacking" of the amplification and pausing or stopping steps has the advantage of
producing a truly randomized sample of polynucleotides having widely varying lengths.
For example, some of the smaller polynucleotides may hybridize with the longer
polynucleotides and act as additional random primers to initiate self-priming
amplification of polynucleotides within the pool.

Such a process provides an efficient means for producing widely-varying
random polynucleotides and subsequent widely-varying mutant proteins corresponding
to the same random selection as in the random polynucleotide pool. The reassembly of
the shorter or smaller polynucleotides after such shuffling to produce the random
polynucleotides may be provided by utilizing procedures standard in the art.

In one embodiment of the invention, the adduct or adducts which halt or slow
the PCR process have been modified with a chemical group for which there exists (or can
be obtained) a monoclonal antibody specific for the same. Such is an example permitting
an efficient separation of polynucleotide chains comprising the DNA adducts (or for the
removal of the adducts which have been released from the DNA polynucleotides which
comprise them) from other polynucleotide chains. In some situations, it may be desirable
to remove such DNA adducts before further processing of the amplified polynucleotides.
In other situations it may be desirable to leave such DNA adducts in the solution with the
intention of producing a further randomized pool of polynucleotides. Whether the DNA
adduct is to be removed or left within the polynucleotide pool depends upon the
composition of the adduct itself and the immediate goal of that amplification process
step.

In a preferred embodiment, the polynucleotides produced by interrupting the
PCR amplification (and optionally subsequent amplification of the said polynucleotides
to produce further randomization under conditions suitable for PCR amplifications) are
recombined to form a shuffled pool of recombined polynucleotides, whereby a
substantial fraction (e.g., greater than 10 percent) of the recombined polynucleotides of
said shuffled pool were not present in the first plurality of selected library members, said
shuffled pool providing a library of displayed polypeptides or displayed antibodies
suitable for affinity interaction screening.
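
The "substantial fraction" criterion above is a simple ratio that can be computed once
both pools have been sequenced. The following is a minimal sketch; the function name and
the short example sequences are hypothetical, and the 10 percent threshold is the example
value given in the text.

```python
def novel_fraction(shuffled_pool, first_plurality):
    """Fraction of shuffled-pool members absent from the starting library."""
    if not shuffled_pool:
        raise ValueError("shuffled pool is empty")
    starting = set(first_plurality)
    novel = sum(1 for seq in shuffled_pool if seq not in starting)
    return novel / len(shuffled_pool)

# Hypothetical starting library and shuffled pool.
first_plurality = ["ACGT", "ACGA", "TCGT"]
shuffled_pool = ["ACGT", "TCGA", "ACGA", "TCGA",
                 "TCGT", "ACGT", "ACGA", "TCGT"]

fraction = novel_fraction(shuffled_pool, first_plurality)  # 2 of 8 are new
is_substantial = fraction > 0.10   # example threshold from the text
```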

Optionally, the method comprises the additional step of screening the library
members of the shuffled pool to identify individual shuffled library members having the
ability to bind or otherwise interact (e.g., such as catalytic antibodies) with a
predetermined macromolecule, such as for example a proteinaceous receptor, peptide,
oligosaccharide, virion, or other predetermined compound or structure.

The displayed polypeptides, antibodies, peptidomimetic antibodies, and
variable region sequences that are identified from such libraries can be used for
therapeutic, diagnostic, research and related purposes (e.g., catalysts, solutes for
increasing osmolarity of an aqueous solution, and the like), and/or can be subjected to
one or more additional cycles of shuffling and/or affinity selection. The method can be
modified such that the step of selecting for a phenotypic characteristic can be other than
of binding affinity for a predetermined molecule (e.g., for catalytic activity, stability,
oxidation resistance, drug resistance, or detectable phenotype conferred upon a host cell).

In one embodiment, the first plurality of selected library members is
produced as polynucleotides and homologously recombined by PCR in vitro; the resultant
polynucleotides are transferred into a host cell or organism via a transferring means and
homologously recombined to form shuffled library members in vivo.

In one embodiment, the first plurality of selected library members is cloned
or amplified on episomally replicable vectors, a multiplicity of said vectors is transferred
into a cell and homologously recombined to form shuffled library members in vivo.

In one embodiment, the first plurality of selected library members is not
produced as shorter or smaller polynucleotides, but is cloned or amplified on an
episomally replicable vector as a direct repeat, with each repeat comprising a distinct
species of selected library member sequence; said vector is transferred into a cell and
homologously recombined by intra-vector recombination to form shuffled library
members in vivo.

In an embodiment, combinations of in vitro and in vivo shuffling are provided 
to enhance combinatorial diversity. 

The present invention provides a method for generating libraries of displayed
antibodies suitable for affinity interaction screening. The method comprises (1)
obtaining a first plurality of selected library members comprising a displayed antibody
and an associated polynucleotide encoding said displayed antibody, and obtaining said
associated polynucleotides or copies thereof, wherein said associated polynucleotides
comprise a region of substantially identical variable region framework sequence, and (2)
pooling and producing shorter or smaller polynucleotides with said associated
polynucleotides or copies to form polynucleotides under conditions suitable for PCR
amplification by slowing or halting the PCR amplification and thereby homologously
recombining said shorter or smaller polynucleotides to form a shuffled pool of
recombined polynucleotides, such that CDR combinations comprised by the recombined
polynucleotides of said shuffled pool are not present in the first plurality of selected
library members, said shuffled pool composing a library of displayed antibodies
comprising CDR permutations and suitable for affinity interaction screening. Optionally,
the shuffled pool is subjected to affinity screening to select shuffled library members
which bind to a predetermined epitope (antigen), thereby selecting a plurality of selected
shuffled library members. Further, the plurality of selected shuffled library members can
be shuffled and screened iteratively, from 1 to about 1000 cycles or as desired until
library members having a desired binding affinity are obtained.
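
The iterative shuffle-and-screen cycle described above has the shape of a simple loop
with a stopping condition. The sketch below assumes hypothetical stand-ins throughout:
`affinity` is a toy score (matches to an invented target string), the shuffle is a
single-point crossover between random pairs that also carries the parents forward, and
the cap of 1000 cycles is the upper bound mentioned in the text.

```python
import random

def evolve(library, shuffle, affinity, target_affinity, max_cycles=1000, keep=10):
    """Alternate shuffling and affinity selection.

    Stops when a member reaches `target_affinity` or after `max_cycles`.
    Returns (best_member, cycles_run).
    """
    best = max(library, key=affinity)
    for cycle in range(1, max_cycles + 1):
        pool = shuffle(library)
        library = sorted(pool, key=affinity, reverse=True)[:keep]
        best = library[0]
        if affinity(best) >= target_affinity:
            return best, cycle
    return best, max_cycles

TARGET = "ACGTACGTAC"  # hypothetical ideal binder, for illustration only

def affinity(seq):
    """Toy affinity: number of positions matching TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))

def make_shuffle(rng, offspring=20):
    """Return a shuffle function doing single-point crossover of random pairs."""
    def shuffle(library):
        pool = list(library)                    # parents stay in the pool
        for _ in range(offspring):
            a, b = rng.choice(library), rng.choice(library)
            cut = rng.randrange(1, len(a))
            pool.append(a[:cut] + b[cut:])
        return pool
    return shuffle

start = ["ACGTAAAAAA", "TTTTTCGTAC", "ACTTACGTTT"]
best, cycles = evolve(start, make_shuffle(random.Random(0)), affinity, len(TARGET))
```

Because the parents are kept in each pool, the best score never decreases from one cycle
to the next; a real screen would replace `affinity` with a measured binding signal.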

According to one aspect, the present invention provides a method for
introducing one or more mutations into a template double-stranded polynucleotide,
wherein the template double-stranded polynucleotide has produced polynucleotides of
a desired size by the above slowed or halted PCR process, by adding to the resultant
population of double-stranded polynucleotides one or more single- or double-stranded
oligonucleotides, wherein said oligonucleotides comprise an area of identity and an area
of heterology to the template polynucleotide; denaturing the resultant mixture of
double-stranded random polynucleotides and oligonucleotides into single-stranded
polynucleotides; incubating the resultant population of single-stranded polynucleotides
with a polymerase under conditions which result in the annealing of said single-stranded
polynucleotides and formation of a mutagenized double-stranded polynucleotide; and
repeating the above steps as desired.

In another aspect the present invention is directed to a method of producing
recombinant proteins having biological activity by treating a sample comprising
double-stranded template polynucleotides encoding a wild-type protein under sexual PCR
conditions according to the present invention which provide for the production of
polynucleotides which include random double-stranded polynucleotides having a desired
size and adding to the resultant population of random polynucleotides one or more single-
or double-stranded oligonucleotides, wherein said oligonucleotides comprise areas of
identity and areas of heterology to the template polynucleotide; denaturing the resulting
mixture of double-stranded polynucleotides and oligonucleotides into single-stranded
polynucleotides; incubating the resultant population of single-stranded polynucleotides
with a polymerase under conditions which cause annealing of said single-stranded
polynucleotides at the areas of identity to occur and thus to form at least one mutagenized
double-stranded polynucleotide; repeating the above steps as desired; and then
expressing the recombinant protein from the mutagenized double-stranded
polynucleotide.

A third aspect of the present invention is directed to a method for obtaining
a chimeric polynucleotide by treating a sample comprising different double-stranded
template polynucleotides wherein said different template polynucleotides contain areas
of identity and areas of heterology under sexual PCR conditions which provide random
double-stranded polynucleotides of a desired size from the template polynucleotide;
denaturing the resulting random double-stranded polynucleotides to provide
single-stranded polynucleotides; incubating the resulting single-stranded polynucleotides
with a polymerase under conditions which provide for the annealing of the single-stranded
polynucleotides at the areas of identity and the formation of a chimeric double-stranded
polynucleotide sequence comprising template polynucleotide sequences; and repeating
the above steps as desired.

A fourth aspect of the present invention is directed to a method of replicating
a template polynucleotide by combining in vitro single-stranded template
polynucleotides with small random single-stranded polynucleotides resulting from the
sexual PCR process according to the present invention and denaturation of the template
polynucleotide, and incubating said mixture of nucleic acid polynucleotides in the
presence of a nucleic acid polymerase under conditions wherein a population of
double-stranded template polynucleotides is formed.

The invention also provides the use of polynucleotide shuffling, in vitro
and/or in vivo, to shuffle polynucleotides encoding polypeptides and/or polynucleotides
comprising transcriptional regulatory sequences.

The invention also provides the use of polynucleotide shuffling to shuffle a
population of viral genes (e.g., capsid proteins, spike glycoproteins, polymerases,
proteases, etc.) or viral genomes (e.g., paramyxoviridae, orthomyxoviridae,
herpesviruses, retroviruses, reoviruses, rhinoviruses, etc.). In an embodiment, the
invention provides a method for shuffling sequences encoding all or portions of
immunogenic viral proteins to generate novel combinations of epitopes as well as novel
epitopes created by recombination; such shuffled viral proteins may comprise epitopes
or combinations of epitopes which are likely to arise in the natural environment as a
consequence of viral evolution (e.g., such as recombination of influenza virus strains).

The invention also provides a method suitable for shuffling polynucleotide
sequences for generating gene therapy vectors and replication-defective gene therapy
constructs, such as may be used for human gene therapy, including but not limited to
vaccination vectors for DNA-based vaccination, as well as anti-neoplastic gene therapy
and other general therapy formats.

Brief Description of the Drawings

Figure 1 is a prior art diagram illustrating the resulting mutant polynucleotide
from mutations by error-prone PCR as contrasted with those from shuffling and
recombination of shorter or smaller polynucleotides.

Figure 2 is a flow chart which illustrates the principles of sexual PCR in three
basic steps: (1) selecting mutants for generation of random-sized polynucleotides,
(2) generating random-sized polynucleotides by halting the PCR process, and
(3) reassembling the random-sized polynucleotides via PCR to form random
polynucleotides.

Figure 3 is a flow chart which illustrates the concepts of utilizing DNA
adducts or UV light to halt PCR and to generate random polynucleotides due to random
priming and incomplete extension of the strands.

Figure 4 is a list of DNA adduct examples and UV light which may be
utilized to halt PCR and generate random polynucleotides.

Figure 5 is a flow chart which illustrates the steps involved in utilizing UV light to
create DNA adducts and halt PCR to generate random polynucleotides.

Figures 6A and 6B illustrate the separation of polynucleotides before
assembly and the results after assembly, wherein Figure 6A is directed to separation
bands of the pre-assembly polynucleotides and Figure 6B is directed in its lane one to
illustrating separation bands of reassembled polynucleotides after the first round of
reassembly PCR and in lane two to illustrating separation bands of reassembled
polynucleotides after the second round of reassembly PCR. Lane 2 shows the complete,
reassembled random polynucleotide ready for amplification, cloning and screening for
a useful utility.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Further advantages of the present invention will become apparent from the
following description of the invention with reference to the attached drawings.

The present invention relates to a method for nucleic acid molecule
reassembly after producing random oligonucleotides via interrupted PCR, and optionally
subjecting at least one of said random oligonucleotides to further PCR as templates to
produce additional oligonucleotides, and the application of such reassembly to
mutagenesis of DNA sequences. Also described is a method for the production of
polynucleotides encoding mutant proteins having enhanced biological activity. In
particular, the present invention also relates to a method of utilizing repeated cycles of
mutagenesis, nucleic acid shuffling according to the sexual PCR oligonucleotide method
of the present invention, and selection, which allow for the creation of mutant proteins
having enhanced biological activity.

The present invention is directed to a method for generating a very large
library of DNA, RNA or protein mutants. This method has particular advantages in the
generation of related polynucleotides from which the desired active polynucleotide
portion(s) may be selected. In particular, the present invention also relates to a method
of repeated cycles of mutagenesis, homologous recombination and selection which allow
for the creation of mutant proteins having enhanced biological activity.

For clarity and consistency, the following terms will be defined as utilized
above, throughout this document and in the claims:

Definitions 

The term "DNA reassembly" is used when recombination occurs between
identical sequences.

By contrast, the term "DNA shuffling" is used herein to indicate recombination
between substantially homologous but non-identical sequences; in some embodiments
DNA shuffling may involve crossover via non-homologous recombination, such
as via cre/lox and/or flp/frt systems and the like.

The term "amplification" means that the number of copies of a polynucleotide
is increased.

The term "identical" or "identity" means that two nucleic acid sequences have
the same sequence or a complementary sequence. Thus, "areas of identity" means that
regions or areas of a polynucleotide or the overall polynucleotide are identical or
complementary to areas of another polynucleotide or the polynucleotide.

The term "corresponds to" is used herein to mean that a polynucleotide
sequence is homologous (i.e., is identical, not strictly evolutionarily related) to all or a
portion of a reference polynucleotide sequence, or that a polypeptide sequence is
identical to a reference polypeptide sequence. In contradistinction, the term
"complementary to" is used herein to mean that the complementary sequence is
homologous to all or a portion of a reference polynucleotide sequence. For illustration,
the nucleotide sequence "TATAC" corresponds to a reference "TATAC" and is
complementary to a reference sequence "GTATA."
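As an informal illustration of the two definitions above, the following Python sketch
treats "corresponds to" as an exact match to all or a portion of the reference, and
"complementary to" as a match of the reverse complement to all or a portion of the
reference. The function names are hypothetical and are not part of the disclosure, and
only the four unmodified DNA bases are handled.

```python
# Watson-Crick pairing for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def corresponds_to(seq: str, reference: str) -> bool:
    """True if seq is identical to all or a portion of the reference."""
    return seq.upper() in reference.upper()


def is_complementary_to(seq: str, reference: str) -> bool:
    """True if the complement of seq is identical to all or a portion of the reference."""
    return reverse_complement(seq) in reference.upper()


if __name__ == "__main__":
    print(corresponds_to("TATAC", "TATAC"))
    print(is_complementary_to("TATAC", "GTATA"))
```

Both checks use simple substring matching, which mirrors the "all or a portion of"
wording; a looser notion of homology would need an alignment step instead.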
The following terms are used to describe the sequence relationships between
two or more polynucleotides: "reference sequence," "comparison window," "sequence
identity," "percentage of sequence identity," and "substantial identity." A "reference
sequence" is a defined sequence used as a basis for a sequence comparison; a reference
sequence may be a subset of a larger sequence, for example, as a segment of a full-length
cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA
or gene sequence. Generally, a reference sequence is at least 20 nucleotides in length,
frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length.
Since two polynucleotides may each (1) comprise a sequence (i.e., a portion of the
complete polynucleotide sequence) that is similar between the two polynucleotides and
(2) may further comprise a sequence that is divergent between the two polynucleotides,
sequence comparisons between two (or more) polynucleotides are typically performed
by comparing sequences of the two polynucleotides over a "comparison window" to
identify and compare local regions of sequence similarity.

A "comparison window," as used herein, refers to a conceptual segment of at
least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be



- WO 98/01581 



PCT/US97/12239 




compared to a reference sequence of at least 20 contiguous nucleotides and wherein the
portion of the polynucleotide sequence in the comparison window may comprise
additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference
sequence (which does not comprise additions or deletions) for optimal alignment of the
two sequences. Optimal alignment of sequences for aligning a comparison window may
be conducted by the local homology algorithm of Smith and Waterman (1981) Adv.
Appl. Math. 2: 482, by the homology alignment algorithm of Needleman and Wunsch
J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson and Lipman
Proc. Natl. Acad. Sci. (U.S.A.) 85: 2444 (1988), by computerized implementations of
these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics
Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison,
WI), or by inspection, and the best alignment (i.e., resulting in the highest percentage of
homology over the comparison window) generated by the various methods is selected.

The term "sequence identity" means that two polynucleotide sequences are
identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. The
term "percentage of sequence identity" is calculated by comparing two optimally aligned
sequences over the window of comparison, determining the number of positions at which
the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield
the number of matched positions, dividing the number of matched positions by the total
number of positions in the window of comparison (i.e., the window size), and
multiplying the result by 100 to yield the percentage of sequence identity. The term
"substantial identity" as used herein denotes a characteristic of a polynucleotide
sequence, wherein the polynucleotide comprises a sequence having at least 80 percent
sequence identity, preferably at least 85 percent identity, often 90 to 95 percent sequence
identity, and most commonly at least 99 percent sequence identity as compared to a
reference sequence over a comparison window of at least 25-50 nucleotides, wherein the
percentage of sequence identity is calculated by comparing the reference sequence to the
polynucleotide sequence which may include deletions or additions which total 20 percent
or less of the reference sequence over the window of comparison.
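The percentage calculation and the 80 percent threshold described above can be
sketched in Python as follows. This is a minimal sketch that assumes the two sequences
have already been optimally aligned to equal length with "-" marking gaps; the function
names, the gap symbol, and the way the 20 percent gap limit is measured (gapped
positions divided by window size) are illustrative assumptions rather than part of the
disclosure.

```python
GAP = "-"  # assumed gap symbol in the pre-aligned input


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by window size, multiplied by 100."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != GAP
    )
    return 100.0 * matches / len(aligned_a)


def gap_fraction(aligned_a: str, aligned_b: str) -> float:
    """Fraction of window positions where either sequence carries a gap."""
    gaps = sum(1 for x, y in zip(aligned_a, aligned_b) if GAP in (x, y))
    return gaps / len(aligned_a)


def substantially_identical(
    aligned_a: str,
    aligned_b: str,
    min_identity: float = 80.0,
    max_gap_fraction: float = 0.20,
) -> bool:
    """Apply the 80 percent identity floor and the 20 percent gap ceiling."""
    return (
        percent_identity(aligned_a, aligned_b) >= min_identity
        and gap_fraction(aligned_a, aligned_b) <= max_gap_fraction
    )


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"  # 20-position comparison window
    candidate = "ACGTACGTTCGTACGTACGA"  # two mismatches, no gaps
    print(percent_identity(reference, candidate))
    print(substantially_identical(reference, candidate))
```

Raising `min_identity` to 85, 90 or 99 gives the progressively stricter levels named
in the definition.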
"Conservative amino acid substitutions" refer to the interchangeability of
residues having similar side chains. For example, a group of amino acids having
aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino
acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino
acids having amide-containing side chains is asparagine and glutamine; a group of amino
acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of
amino acids having basic side chains is lysine, arginine, and histidine; and a group of
amino acids having sulfur-containing side chains is cysteine and methionine. Preferred
conservative amino acid substitution groups are: valine-leucine-isoleucine,
phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
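The side-chain groups listed above lend themselves to a simple lookup. The Python
sketch below encodes the six groups with standard one-letter amino acid codes and tests
whether a substitution stays within one group; the dictionary and function names are
hypothetical, and the sketch does not distinguish the "preferred" subgroups.

```python
# Side-chain groups from the definition, in one-letter amino acid codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),        # Gly, Ala, Val, Leu, Ile
    "aliphatic-hydroxyl": set("ST"),  # Ser, Thr
    "amide-containing": set("NQ"),    # Asn, Gln
    "aromatic": set("FYW"),           # Phe, Tyr, Trp
    "basic": set("KRH"),              # Lys, Arg, His
    "sulfur-containing": set("CM"),   # Cys, Met
}


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if two different residues share one of the side-chain groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group for group in SIDE_CHAIN_GROUPS.values())


if __name__ == "__main__":
    print(is_conservative_substitution("V", "L"))  # valine to leucine
    print(is_conservative_substitution("K", "D"))  # lysine to aspartate
```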

The term "homologous" or "homeologous" means that one single-stranded
nucleic acid sequence may hybridize to a complementary single-stranded
nucleic acid sequence. The degree of hybridization may depend on a number of factors
including the amount of identity between the sequences and the hybridization conditions
such as temperature and salt concentrations as discussed later. Preferably the region of
identity *is greater than about > bp, more, preferably the region of identity is greater than ' 

The term "heterologous" means that one single-stranded nucleic acid
sequence is unable to hybridize to another single-stranded nucleic acid sequence or its
complement. Thus areas of heterology means that areas of polynucleotides or
polynucleotides have areas or regions within their sequence which are unable to
hybridize to another nucleic acid or polynucleotide. Such regions or areas are, for
example, areas of mutations.

The term "cognate" as used herein refers to a gene sequence that is
evolutionarily and functionally related between species. For example but not limitation,
in the human genome the human CD4 gene is the cognate gene to the mouse 3d4 gene,
since the sequences and structures of these two genes indicate that they are highly
homologous and both genes encode a protein which functions in signaling T cell
activation through MHC class II-restricted antigen recognition.

The term "wild-type" means that the polynucleotide does not comprise any
mutations. A "wild type" protein means that the protein will be active at a level of
activity found in nature and will comprise the amino acid sequence found in nature.

The term "related polynucleotides" means that regions or areas of the
polynucleotides are identical and regions or areas of the polynucleotides are
heterologous.

The term "chimeric polynucleotide" means that the polynucleotide comprises
regions which are wild-type and regions which are mutated. It may also mean that the
polynucleotide comprises wild-type regions from one polynucleotide and wild-type
regions from another related polynucleotide.

The term "cleaving" means digesting the polynucleotide with enzymes or
breaking the polynucleotide.

The term "population" as used herein means a collection of components such
as polynucleotides, portions of polynucleotides or proteins. A "mixed population" means
a collection of components which belong to the same family of nucleic acids or proteins
(i.e., are related) but which differ in their sequence (i.e., are not identical) and hence in
their biological activity.

The term "specific polynucleotide" means a polynucleotide having certain end
points and having a certain nucleic acid sequence. Two polynucleotides wherein one
polynucleotide has the identical sequence as a portion of the second polynucleotide but
different ends comprise two different specific polynucleotides.

The term "mutations" means changes in the sequence of a wild-type nucleic
acid sequence or changes in the sequence of a peptide. Such mutations may be point
mutations such as transitions or transversions. The mutations may be deletions,
insertions or duplications.

In the polypeptide notation used herein, the left-hand direction is the amino
terminal direction and the right-hand direction is the carboxy-terminal direction, in
accordance with standard usage and convention. Similarly, unless specified otherwise,
the left-hand end of single-stranded polynucleotide sequences is the 5' end; the left-hand
direction of double-stranded polynucleotide sequences is referred to as the 5' direction.
The direction of 5' to 3' addition of nascent RNA transcripts is referred to as the
transcription direction; sequence regions on the DNA strand having the same sequence
as the RNA and which are 5' to the 5' end of the RNA transcript are referred to as
"upstream sequences"; sequence regions on the DNA strand having the same sequence
as the RNA and which are 3' to the 3' end of the coding RNA transcript are referred to
as "downstream sequences".
The term "naturally-occurring" as used herein as applied to an object refers
to the fact that an object can be found in nature. For example, a polypeptide or
polynucleotide sequence that is present in an organism (including viruses) that can be
isolated from a source in nature and which has not been intentionally modified by man
in the laboratory is naturally occurring. Generally, the term naturally occurring refers
to an object as present in a non-pathological (un-diseased) individual, such as would be
typical for the species.

The term "agent" is used herein to denote a chemical compound, a mixture
of chemical compounds, an array of spatially localized compounds (e.g., a VLSIPS
peptide array, polynucleotide array, and/or combinatorial small molecule array),
biological macromolecule, a bacteriophage peptide display library, a bacteriophage
antibody (e.g., scFv) display library, a polysome peptide display library, or an extract
made from biological materials such as bacteria, plants, fungi, or animal (particularly
mammalian) cells or tissues. Agents are evaluated for potential activity as
anti-neoplastics or anti-inflammatories by inclusion in screening assays
described hereinbelow. Agents are evaluated for potential activity as specific protein
interaction inhibitors (i.e., an agent which selectively inhibits a binding interaction
between two predetermined polypeptides but which does not substantially interfere with
cell viability) by inclusion in screening assays described hereinbelow.

As used herein, "substantially pure" means an object species is the
predominant species present (i.e., on a molar basis it is more abundant than any other
individual macromolecular species in the composition), and preferably a substantially
purified fraction is a composition wherein the object species comprises at least about 50
percent (on a molar basis) of all macromolecular species present. Generally, a
substantially pure composition will comprise more than about 80 to 90 percent of all
macromolecular species present in the composition. Most preferably, the object species
is purified to essential homogeneity (contaminant species cannot be detected in the
composition by conventional detection methods) wherein the composition consists
essentially of a single macromolecular species. Solvent species, small molecules (<500
Daltons), and elemental ion species are not considered macromolecular species.
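The tiers of purity in this definition are molar-fraction comparisons, which the
following Python sketch makes explicit. It assumes the caller supplies molar amounts for
macromolecular species only (solvent, small molecules and ions already excluded); the
function name, the dictionary layout and the use of 0.5 and 0.8 as cut-offs for "about
50 percent" and "about 80 to 90 percent" are illustrative assumptions.

```python
def purity_report(moles_by_species: dict, target: str) -> dict:
    """Summarize how a target species ranks against the purity tiers.

    moles_by_species maps each macromolecular species name to its molar amount.
    """
    total = sum(moles_by_species.values())
    if total <= 0:
        raise ValueError("total molar amount must be positive")
    target_moles = moles_by_species[target]
    fraction = target_moles / total
    # "Predominant": more abundant than any other individual species.
    predominant = all(
        target_moles > moles
        for name, moles in moles_by_species.items()
        if name != target
    )
    return {
        "molar_fraction": fraction,
        "predominant": predominant,
        "at_least_half": fraction >= 0.5,
        "above_80_percent": fraction > 0.8,
    }


if __name__ == "__main__":
    print(purity_report({"object": 9.0, "contaminant": 1.0}, "object"))
```

A species can be predominant without reaching the 50 percent tier, for example when
several contaminants are each individually less abundant than the object species.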

As used herein the term "physiological conditions" refers to temperature, pH,
ionic strength, viscosity, and like biochemical parameters which are compatible with a
viable organism, and/or which typically exist intracellularly in a viable cultured yeast cell
or mammalian cell. For example, the intracellular conditions in a yeast cell grown under
typical laboratory culture conditions are physiological conditions. Suitable in vitro
reaction conditions for in vitro transcription cocktails are generally physiological
conditions. In general, in vitro physiological conditions comprise 50-200 mM NaCl or
KCl, pH 6.5-8.5, 20-45°C and 0.001-30 mM divalent cation (e.g., Mg++, Ca++);
preferably about 150 mM NaCl or KCl, pH 7.2-7.6, 5 mM divalent cation, and often
include 0.01-1.0 percent nonspecific protein (e.g., BSA). A non-ionic detergent (Tween,
NP-40, Triton X-100) can often be present, usually at about 0.001 to 2%, typically
0.05-0.2% (v/v). Particular aqueous conditions may be selected by the practitioner
according to conventional methods. For general guidance, the following buffered aqueous
conditions may be applicable: 10-250 mM NaCl, 5-50 mM Tris HCl, pH 5-8, with
optional addition of divalent cation(s) and/or metal chelators and/or non-ionic detergents
and/or membrane fractions and/or anti-foam agents and/or scintillants.

"Specific hybridization" is defined herein as the formation of hybrids between
a first polynucleotide and a second polynucleotide (e.g., a polynucleotide having a
distinct but substantially identical sequence to the first polynucleotide), wherein
substantially unrelated polynucleotide sequences do not form hybrids in the mixture.

As used herein, the term "single-chain antibody" refers to a polypeptide
comprising a VH domain and a VL domain in polypeptide linkage, generally linked via a
spacer peptide (e.g., [Gly-Gly-Gly-Gly-Ser]x), and which may comprise additional amino
acid sequences at the amino- and/or carboxy- termini. For example, a single-chain
antibody may comprise a tether segment for linking to the encoding polynucleotide. As
an example, a scFv is a single-chain antibody. Single-chain antibodies are generally
proteins consisting of one or more polypeptide segments of at least 10 contiguous amino
acids substantially encoded by genes of the immunoglobulin superfamily (e.g., see The
Immunoglobulin Gene Superfamily, A.F. Williams and A.N. Barclay, in Immunoglobulin
Genes, T. Honjo, F.W. Alt, and T.H. Rabbitts, eds., (1989) Academic Press: San Diego,
CA, pp. 361-368, which is incorporated herein by reference), most frequently encoded
by a rodent, non-human primate, avian, porcine, bovine, ovine, goat, or human heavy
chain or light chain gene sequence. A functional single-chain antibody generally
contains a sufficient portion of an immunoglobulin superfamily gene product so as to
retain the property of binding to a specific target molecule, typically a receptor or antigen
(epitope).

As used herein, the term "complementarity-determining region" and "CDR"
refer to the art-recognized term as exemplified by the Kabat and Chothia CDR definitions
also generally known as supervariable regions or hypervariable loops (Chothia and Lesk
(1987) J. Mol. Biol. 196: 901; Chothia et al. (1989) Nature 342: 877; E.A. Kabat et al.,
Sequences of Proteins of Immunological Interest (National Institutes of Health, Bethesda,
MD) (1987); and Tramontano et al. (1990) J. Mol. Biol. 215: 175). Variable region
domains typically comprise the amino-terminal approximately 105-115 amino acids of
a naturally-occurring immunoglobulin chain (e.g., amino acids 1-110), although variable
domains somewhat shorter or longer are also suitable for forming single-chain
antibodies.

An immunoglobulin light or heavy chain variable region consists of a
"framework" region interrupted by three hypervariable regions, also called CDR's. The
extent of the framework region and CDR's have been precisely defined (see, "Sequences
of Proteins of Immunological Interest," E. Kabat et al., 4th Ed., U.S. Department of
Health and Human Services, Bethesda, MD (1987)). The sequences of the framework
regions of different light or heavy chains are relatively conserved within a species. As
used herein, a "human framework region" is a framework region that is substantially
identical (about 85% or more, usually 90-95% or more) to the framework region of a
naturally occurring human immunoglobulin. The framework region of an antibody, that
is the combined framework regions of the constituent light and heavy chains, serves to
position and align the CDR's. The CDR's are primarily responsible for binding to an
epitope of an antigen.

As used herein, the term "variable segment" refers to a portion of a nascent
peptide which comprises a random, pseudorandom, or defined kernel sequence. A
variable segment can comprise both variant and invariant residue positions, and the
degree of residue variation at a variant residue position may be limited; both options
are selected at the discretion of the practitioner. Typically, variable segments are about
5 to 20 amino acid residues in length (e.g., 8 to 10), although variable segments may be
longer and may comprise antibody portions or receptor proteins, such as an antibody
fragment, a nucleic acid binding protein, a receptor protein, and the like.

As used herein, "random peptide sequence" refers to an amino acid sequence
composed of two or more amino acid monomers and constructed by a stochastic or
random process. A random peptide can include framework or scaffolding motifs, which
may comprise invariant sequences.

As used herein, "random peptide library" refers to a set of polynucleotide
sequences that encodes a set of random peptides, and to the set of random peptides
encoded by those polynucleotide sequences, as well as the fusion proteins containing
those random peptides.

As used herein, the term "pseudorandom" refers to a set of sequences that
have limited variability, so that for example the degree of residue variability at one
position is different than the degree of residue variability at another position, but any
pseudorandom position is allowed some degree of residue variation, however
circumscribed.

As used herein, the term "defined sequence framework" refers to a set of
defined sequences that are selected on a non-random basis, generally on the basis of
experimental data or structural data; for example, a defined sequence framework may
comprise a set of amino acid sequences that are predicted to form a β-sheet structure or
may comprise a leucine zipper heptad repeat motif, a zinc-finger domain, among other
variations. A "defined sequence kernel" is a set of sequences which encompass a limited
scope of variability. Whereas (1) a completely random 10-mer sequence of the 20
conventional amino acids can be any of (20)^10 sequences, and (2) a pseudorandom
10-mer sequence of the 20 conventional amino acids can be any of (20)^10 sequences but
will exhibit a bias for certain residues at certain positions and/or overall, (3) a defined
sequence kernel is a subset of the sequences that would be possible if each residue
position were allowed to be any of the allowable 20 conventional amino acids (and/or
allowable unconventional amino/imino acids). A defined sequence kernel generally
comprises variant and invariant residue positions and/or comprises variant residue
positions which can comprise a residue selected from a defined subset of amino acid
residues, and the like, either segmentally or over the entire length of the individual
selected library member sequence. Defined sequence kernels can refer to either amino
acid sequences or polynucleotide sequences. For illustration and not limitation, the
sequences (NNK)10 and (NNM)10, wherein N represents A, T, G, or C; K represents G
or T; and M represents A or C, are defined sequence kernels.
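To make the sizes in this definition concrete, the Python sketch below computes the
(20)^10 figure and expands the NNK and NNM codons named above. Translating the
expanded codons with the standard genetic code is an addition made here for illustration
only; it shows how a nucleotide-level kernel restricts the amino acids that can appear at
a position. All names in the code are hypothetical.

```python
from itertools import product

# Nucleotide codes used by the two kernels (N, K and M as defined in the text).
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T",
              "N": "ACGT", "K": "GT", "M": "AC"}

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def expand(degenerate_codon: str) -> list:
    """List every concrete codon matched by a degenerate codon such as 'NNK'."""
    choices = (DEGENERATE[symbol] for symbol in degenerate_codon)
    return ["".join(codon) for codon in product(*choices)]


def kernel_summary(degenerate_codon: str) -> tuple:
    """Return (codon count, distinct amino acids encoded, stop codon count)."""
    encoded = [CODON_TABLE[codon] for codon in expand(degenerate_codon)]
    amino_acids = set(encoded) - {"*"}
    return len(encoded), len(amino_acids), encoded.count("*")


# Number of completely random 10-mers over the 20 conventional amino acids.
fully_random_10mers = 20 ** 10

if __name__ == "__main__":
    print(fully_random_10mers)
    print(kernel_summary("NNK"))
    print(kernel_summary("NNM"))
```

Each kernel position draws from 32 of the 64 possible codons, so a ten-codon kernel
spans 32**10 nucleotide sequences rather than the 4**30 of a fully random 30-mer.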

As used herein, "epitope" refers to that portion of an antigen or other
macromolecule capable of forming a binding interaction that interacts with the variable
region binding body of an antibody. Typically, such binding interaction is manifested
as an intermolecular contact with one or more amino acid residues of a CDR.

As used herein, "receptor" refers to a molecule that has an affinity for a given
ligand. Receptors can be naturally occurring or synthetic molecules. Receptors can be
employed in an unaltered state or as aggregates with other species. Receptors can be
attached, covalently or non-covalently, to a binding member, either directly or via a
specific binding substance. Examples of receptors include, but are not limited to,
antibodies, including monoclonal antibodies and antisera reactive with specific antigenic
determinants (such as on viruses, cells, or other materials), cell membrane receptors,
complex carbohydrates and glycoproteins, enzymes, and hormone receptors.

As used herein, "ligand" refers to a molecule, such as a random peptide or
variable segment sequence, that is recognized by a particular receptor. As one of skill
in the art will recognize, a molecule (or macromolecular complex) can be both a receptor
and a ligand. In general, the binding partner having a smaller molecular weight is
referred to as the ligand and the binding partner having a greater molecular weight is
referred to as a receptor.
As used herein, "linker" or "spacer" refers to a molecule or group of
molecules that connects two molecules, such as a DNA binding protein and a random
peptide, and serves to place the two molecules in a preferred configuration, e.g., so that
the random peptide can bind to a receptor with minimal steric hindrance from the DNA
binding protein.

As used herein, the term "operably linked" refers to a linkage of polynucleotide
elements in a functional relationship. A nucleic acid is "operably linked" when it is
placed into a functional relationship with another nucleic acid sequence. For instance,
a promoter or enhancer is operably linked to a coding sequence if it affects the
transcription of the coding sequence. Operably linked means that the DNA sequences
being linked are typically contiguous and, where necessary to join two protein coding
regions, contiguous and in reading frame.

As used herein, the "means for slowing or halting the PCR amplification
process" is defined as utilization of UV light or a DNA adduct to slow or halt the PCR
amplification of at least one polynucleotide. Preferably, such a means is either UV light
or a DNA adduct which is a member selected from the group consisting of: (+)-CC-1065,
or a synthetic analog such as (+)-CC-1065-(N3-Adenine) (see Biochem. 31, 2822-2829
(1992)); an N-acetylated or deacetylated 4'-fluoro-4-aminobiphenyl adduct capable of
inhibiting DNA synthesis (see, for example, Carcinogenesis vol. 13, No. 5, 751-758
(1992)); or an N-acetylated or deacetylated 4-aminobiphenyl adduct capable of
inhibiting DNA synthesis (see also id., 751-758); trivalent chromium, a trivalent
chromium salt, a polycyclic aromatic hydrocarbon ("PAH") DNA adduct capable of
inhibiting DNA replication, such as 7-bromomethyl-benz[a]anthracene ("BMA"),
tris(2,3-dibromopropyl)phosphate ("Tris-BP"), 1,2-dibromo-3-chloropropane ("DBCP"),
2-bromoacrolein (2BA), benzo[a]pyrene-7,8-dihydrodiol-9-10-epoxide ("BPDE"), a
platinum(II) halogen salt, N-hydroxy-2-amino-3-methylimidazo[4,5-f]-quinoline
("N-hydroxy-IQ"), and N-hydroxy-2-amino-1-methyl-6-phenylimidazo[4,5-b]pyridine
("N-hydroxy-PhIP"). Especially preferred "means for slowing or halting PCR
amplification" consist of UV light, (+)-CC-1065 and (+)-CC-1065-(N3-Adenine).
Particularly encompassed means are DNA adducts or polynucleotides comprising the
DNA adducts from the polynucleotides or polynucleotides pool, which can be released
or removed by a process including heating the solution comprising the polynucleotides
prior to further processing.

Methodology

Nucleic acid shuffling is a method for in vitro or in vivo homologous
recombination of pools of shorter or smaller polynucleotides to produce a polynucleotide
or polynucleotides. Mixtures of related nucleic acid sequences or polynucleotides are
subjected to sexual PCR to provide random polynucleotides, and reassembled to yield
a library or mixed population of recombinant mutant nucleic acid molecules or
polynucleotides.

In contrast to cassette mutagenesis, only shuffling and error-prone PCR allow
one to mutate a pool of sequences blindly (without sequence information other than
primers).

- <' The advantage of the mutagenic shuffling "of. this invention over crtor-pronc .■ 
.PCR. alone for repeated. selection can best be explained w;ih an example. from antibody 
20- engineering. ' In Figure' 1- is- shown a prior an 'schematic diagram of DNA\shuffling as 
compared \yith"eiTor-pr6nc PCR (not sexual^ PCR). The initial library of selected pooled- 
. *; ' sequences carv consist of related sequences-of diverse origin (i.e. antibodies. from naive : 
" mRTs'A)or can be derivedyby any type of mutagenesis (including shuffling) of a single 
antibody 'gene. A collection of selected complementarity determining regions ("CDRs") 
25 is obtained after the first round of affinity selection (Fig. 1). " In the diagram the thick * 
CDRs confer onto the antibody molecule increased affinity for the antigen; Shuffling 
allows the free combinatorial association of all of the CDRls with all of the CDR2s with 
all oftheCDR3s,etc. 
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This method differs from error-prone FCR; in that it is an inverse chain 

reaction. In error-prone PCR, the number of polymerase start sites and the number of 

molecules grows exponentially. However, the sequence of the polymerase start sites and 

the sequence of the molecules remains essentially the same. In contrast, in nucleic acid 

5 reassembly or shuffling of random polynucleotides the number of start sites and the 

number (but not size) of the random polynucleotides decreases over time. For 

polynucleotides denved from whole plasmids the theoretical endpoint is a single, large 

concatemeric molecule. - .,- 
Since cross-overs occur at regions of homology, recombination will primarily 

10 occur between members ofthe same sequence Tamily. This discourages combinations 
of CDRs that are grossly. incompatible (e.g., directed against different epitopes ofthe 
same antigen). It is contemplated that multiple families of sequences can be shuffled in 
the same reaction. Further, shuffling genera!:/ conserves the relative order, such thai, 
for example, CDR1 will not be found in the position of CDR2. 
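
The two properties stated above (cross-overs only at regions of homology, and conservation of relative order) can be illustrated with a toy in-silico model. This is only a sketch and not the laboratory procedure: it assumes two pre-aligned parent sequences of equal length, and the function names and the `min_identity` parameter are hypothetical, introduced purely for illustration.

```python
import random


def crossover_points(a, b, min_identity=4):
    """Return positions p such that the min_identity bases ending just
    before p are identical in both aligned parents (a toy stand-in for
    a 'region of homology')."""
    if len(a) != len(b):
        raise ValueError("parents must be pre-aligned and of equal length")
    points = []
    run = 0  # length of the current run of identical bases
    for i, (x, y) in enumerate(zip(a, b)):
        run = run + 1 if x == y else 0
        if run >= min_identity:
            points.append(i + 1)
    return points


def shuffle_pair(a, b, n_crossovers=1, min_identity=4, rng=None):
    """Build a recombinant that starts on parent a and switches parent
    at randomly chosen homology-supported positions. Because switching
    happens at aligned positions, segment order is conserved."""
    rng = rng or random.Random()
    points = [p for p in crossover_points(a, b, min_identity) if p < len(a)]
    chosen = sorted(rng.sample(points, min(n_crossovers, len(points))))
    parents = (a, b)
    current = 0
    pieces = []
    previous = 0
    for p in chosen:
        pieces.append(parents[current][previous:p])
        current ^= 1  # switch template strand at the crossover
        previous = p
    pieces.append(parents[current][previous:])
    return "".join(pieces)
```

With parents that share no run of four identical bases, `crossover_points` returns an empty list and `shuffle_pair` simply returns the first parent, which mirrors the statement that recombination is discouraged between grossly dissimilar sequences.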

Rare shufflants will contain a large number of the best (e.g., highest affinity) CDRs and these rare shufflants may be selected based on their superior affinity (Fig. 1). CDRs from a pool of 100 different selected antibody sequences can be permutated in up to 100^6 (i.e., 10^12) different ways. This large number of permutations cannot be represented in a single library of DNA sequences. Accordingly, it is contemplated that multiple cycles of DNA shuffling and selection may be required depending on the length of the sequence and the sequence diversity desired.
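
The permutation count quoted above can be checked with a few lines of arithmetic. The sketch below assumes that the figure is 100 raised to the sixth power, i.e., six CDR positions per antibody (three heavy-chain and three light-chain), each drawn independently from the pool; the library size used in the usage note is a hypothetical value, not one taken from this document.

```python
def cdr_combinations(pool_size, cdrs_per_antibody=6):
    """Upper bound on distinct CDR combinations when every CDR position
    can be drawn independently from any member of the pool."""
    return pool_size ** cdrs_per_antibody


def max_library_coverage(pool_size, library_size, cdrs_per_antibody=6):
    """Largest fraction of all combinations that a library of the given
    size could represent, assuming every clone is unique."""
    total = cdr_combinations(pool_size, cdrs_per_antibody)
    return min(1.0, library_size / total)
```

For example, under these assumptions a hypothetical library of 10^8 clones could cover at most one ten-thousandth of the 10^12 combinations, which is consistent with the need for multiple cycles of shuffling and selection.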

Error-prone PCR, in contrast, keeps all the selected CDRs in the same relative
sequence (Fig. 1), generating a much smaller mutant cloud.

The template polynucleotide which may be used in the methods of this invention may be DNA or RNA. It may be of various lengths depending on the size of the gene or shorter or smaller polynucleotide to be recombined or reassembled. Preferably, the template polynucleotide is from 50 bp to 50 kb. It is contemplated that entire vectors containing the nucleic acid encoding the protein of interest can be used in the methods of this invention, and in fact have been successfully used.

The template polynucleotide may be obtained by amplification using the PCR reaction (U.S. Patent Nos. 4,683,202 and 4,683,195) or other amplification or cloning methods. However, the removal of free primers from the PCR products before subjecting them to pooling of the PCR products and sexual PCR may provide more efficient results. Failure to adequately remove the primers from the original pool before sexual PCR can lead to a low frequency of crossover clones.

The template polynucleotide often should be double-stranded. A double-stranded nucleic acid molecule is recommended to ensure that regions of the resulting single-stranded polynucleotides are complementary to each other and thus can hybridize to form a double-stranded molecule.

It is contemplated that single-stranded or double-stranded nucleic acid polynucleotides having regions of identity to the template polynucleotide and regions of heterology to the template polynucleotide may be added to the template polynucleotide at this step. It is also contemplated that two different but related polynucleotide templates can be mixed at this step.

The double-stranded polynucleotide template and any added double- or single-stranded polynucleotides are subjected to sexual PCR which includes slowing or halting to provide a mixture of polynucleotides of from about 5 bp to 5 kb or more. Preferably, the size of the random polynucleotides is from about 10 bp to 1000 bp, more preferably the size of the polynucleotides is from about 20 bp to 500 bp.

Alternatively, it is also contemplated that double-stranded nucleic acid having multiple nicks may be used in the methods of this invention. A nick is a break in one strand of the double-stranded nucleic acid. The distance between such nicks is preferably 5 bp to 5 kb, more preferably between 10 bp to 1000 bp. This can provide areas of self-priming to produce shorter or smaller polynucleotides to be included with the polynucleotides resulting from random primers, for example.

The concentration of any one specific polynucleotide will not be greater than 1% by weight of the total polynucleotides, more preferably the concentration of any one specific nucleic acid sequence will not be greater than 0.1% by weight of the total nucleic acid.

The number of different specific polynucleotides in the mixture will be at least about 100, preferably at least about 500, and more preferably at least about 1000.
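
The two composition limits just stated (no single species above 1% by weight, and at least about 100 different species) lend themselves to a simple bookkeeping check. The sketch below is illustrative only: the function name, the dictionary input format, and the use of nanograms are assumptions, and the defaults encode only the broadest limits given in the text.

```python
def check_mixture(masses_ng, max_fraction=0.01, min_species=100):
    """Check a mixture, given as {species_id: mass}, against a maximum
    per-species weight fraction and a minimum number of species."""
    total = sum(masses_ng.values())
    if total <= 0:
        raise ValueError("total mass must be positive")
    worst = max(masses_ng.values()) / total  # largest single-species share
    return {
        "n_species": len(masses_ng),
        "max_fraction": worst,
        "ok": len(masses_ng) >= min_species and worst <= max_fraction,
    }
```

Passing `max_fraction=0.001` and `min_species=1000` would test the more preferred limits instead.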

At this step single-stranded or double-stranded polynucleotides, either synthetic or natural, may be added to the random double-stranded shorter or smaller polynucleotides in order to increase the heterogeneity of the mixture of polynucleotides.

It is also contemplated that populations of double-stranded randomly broken polynucleotides may be mixed or combined at this step with the polynucleotides from the sexual PCR process and optionally subjected to one or more additional sexual PCR cycles.

Where insertion of mutations into the template polynucleotide is desired, single-stranded or double-stranded polynucleotides having a region of identity to the template polynucleotide and a region of heterology to the template polynucleotide may be added in a 20 fold excess by weight as compared to the total nucleic acid, more preferably the single-stranded polynucleotides may be added in a 10 fold excess by weight as compared to the total nucleic acid.

Where a mixture of different but related template polynucleotides is desired, populations of polynucleotides from each of the templates may be combined at a ratio of less than about 1:100, more preferably the ratio is less than about 1:40. For example, a backcross of the wild-type polynucleotide with a population of mutated polynucleotide may be desired to eliminate neutral mutations (e.g., mutations yielding an insubstantial alteration in the phenotypic property being selected for). In such an example, the ratio of randomly provided wild-type polynucleotides which may be added to the randomly provided sexual PCR cycle hybrid polynucleotides is approximately 1:1 to about 100:1, and more preferably from 1:1 to 40:1.

The mixed population of random polynucleotides is denatured to form single-stranded polynucleotides and then re-annealed. Only those single-stranded polynucleotides having regions of homology with other single-stranded polynucleotides will re-anneal.

The random polynucleotides may be denatured by heating. One skilled in the art could determine the conditions to completely denature the double-stranded nucleic acid. Preferably the temperature is from 80 °C to 100 °C, more preferably the temperature is from 90 °C to 96 °C. Other methods which may be used to denature the polynucleotides include pressure (36) and pH.

The polynucleotides may be re-annealed by cooling. Preferably the temperature is from 20 °C to 75 °C, more preferably the temperature is from 40 °C to 65 °C. If a high frequency of crossovers is needed based on an average of only 4 consecutive bases of homology, recombination can be forced by using a low annealing temperature, although the process becomes more difficult. The degree of renaturation which occurs will depend on the degree of homology between the population of single-stranded polynucleotides.

Renaturation can be accelerated by the addition of polyethylene glycol ("PEG") or salt. The salt concentration is preferably from 0 mM to 200 mM, more preferably the salt concentration is from 10 mM to 100 mM. The salt may be KCl or NaCl. The concentration of PEG is preferably from 0% to 20%, more preferably from 5% to 10%.
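
The denaturation, re-annealing, salt, and PEG ranges given above can be gathered into one lookup so that a planned condition is easy to compare against them. This is a minimal sketch: the dictionary keys and the `classify` function are hypothetical names, and the numbers are simply the preferred and more preferred ranges as read from the text.

```python
# (preferred range, more preferred range) for each parameter, as stated
# in the description; key names are illustrative only.
RANGES = {
    "denaturation_temp_c": ((80, 100), (90, 96)),
    "annealing_temp_c": ((20, 75), (40, 65)),
    "salt_mm": ((0, 200), (10, 100)),
    "peg_percent": ((0, 20), (5, 10)),
}


def classify(parameter, value):
    """Report whether a value falls in the more preferred range, the
    preferred range, or outside both."""
    (low, high), (best_low, best_high) = RANGES[parameter]
    if best_low <= value <= best_high:
        return "more preferred"
    if low <= value <= high:
        return "preferred"
    return "outside stated range"
```

For instance, an annealing temperature of 25 °C is inside the preferred range but outside the more preferred one, which matches the remark that low annealing temperatures can be used to force recombination at short stretches of homology.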

The annealed polynucleotides are next incubated in the presence of a nucleic acid polymerase and dNTPs (i.e., dATP, dCTP, dGTP and dTTP). The nucleic acid polymerase may be the Klenow fragment, the Taq polymerase or any other DNA polymerase known in the art.

The approach to be used for the assembly depends on the minimum degree of homology that should still yield crossovers. If the areas of identity are large, Taq polymerase can be used with an annealing temperature of between 45-65 °C. If the areas of identity are small, Klenow polymerase can be used with an annealing temperature of between 20-30 °C. One skilled in the art could vary the temperature of annealing to increase the number of cross-overs achieved.
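
The choice between the two polymerases described above can be written as a small decision helper. The text does not say where "large" ends and "small" begins, so the `large_identity_threshold_bp` cutoff below is a purely hypothetical parameter; only the polymerase names and temperature windows come from the description.

```python
def suggest_assembly_conditions(identity_bp, large_identity_threshold_bp=20):
    """Return (polymerase, (min_anneal_c, max_anneal_c)) for a given
    length of identity. The threshold is an illustrative assumption."""
    if identity_bp >= large_identity_threshold_bp:
        return ("Taq", (45, 65))
    return ("Klenow", (20, 30))
```

A practitioner would adjust the cutoff, and the annealing temperature within each window, empirically to obtain the desired number of cross-overs.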

The polymerase may be added to the random polynucleotides prior to
annealing, simultaneously with annealing or after annealing.

The cycle of denaturation, renaturation and incubation in the presence of polymerase is referred to herein as shuffling or reassembly of the nucleic acid. This cycle is repeated for a desired number of times. Preferably the cycle is repeated from 2 to 50 times, more preferably the sequence is repeated from 10 to 40 times.

The resulting nucleic acid is a larger double-stranded polynucleotide of from
about 50 bp to about 100 kb, preferably the larger polynucleotide is from 500 bp to 50
kb.

This larger polynucleotide may contain a number of copies of a polynucleotide having the same size as the template polynucleotide in tandem. This concatemeric polynucleotide is then denatured into single copies of the template polynucleotide. The result will be a population of polynucleotides of approximately the same size as the template polynucleotide. The population will be a mixed population where single- or double-stranded polynucleotides having an area of identity and an area of heterology have been added to the template polynucleotide prior to shuffling.

These polynucleotides are then cloned into the appropriate vector and the ligation mixture used to transform bacteria.

It is contemplated that the single polynucleotides may be obtained from the larger concatemeric polynucleotide by amplification of the single polynucleotide prior to cloning by a variety of methods including PCR (U.S. Patent Nos. 4,683,195 and 4,683,202), rather than by digestion of the concatemer.

The vector used for cloning is not critical provided that it will accept a polynucleotide of the desired size. If expression of the particular polynucleotide is desired, the cloning vehicle should further comprise transcription and translation signals next to the site of insertion of the polynucleotide to allow expression of the polynucleotide in the host cell. Preferred vectors include the pUC series and the pBR series of plasmids.

The resulting bacterial population will include a number of recombinant polynucleotides having random mutations. This mixed population may be tested to identify the desired recombinant polynucleotides. The method of selection will depend on the polynucleotide desired.

For example, if a polynucleotide which encodes a protein with increased binding efficiency to a ligand is desired, the proteins expressed by each of the portions of the polynucleotides in the population or library may be tested for their ability to bind to the ligand by methods known in the art (e.g., panning, affinity chromatography). If a polynucleotide which encodes a protein with increased drug resistance is desired, the proteins expressed by each of the polynucleotides in the population or library may be tested for their ability to confer drug resistance to the host organism. One skilled in the art, given knowledge of the desired protein, could readily test the population to identify polynucleotides which confer the desired properties onto the protein.

It is contemplated that one skilled in the art could use a phage display system in which fragments of the protein are expressed as fusion proteins on the phage surface (Pharmacia, Milwaukee WI). The recombinant DNA molecules are cloned into the phage DNA at a site which results in the transcription of a fusion protein a portion of which is encoded by the recombinant DNA molecule. The phage containing the recombinant nucleic acid molecule undergoes replication and transcription in the cell. The leader sequence of the fusion protein directs the transport of the fusion protein to the tip of the phage particle. Thus the fusion protein which is partially encoded by the recombinant DNA molecule is displayed on the phage particle for detection and selection by the methods described above.

It is further contemplated that a number of cycles of nucleic acid shuffling may be conducted with polynucleotides from a sub-population of the first population, which sub-population contains DNA encoding the desired recombinant protein. In this manner, proteins with even higher binding affinities or enzymatic activity could be achieved.

It is also contemplated that a number of cycles of nucleic acid shuffling may be conducted with a mixture of wild-type polynucleotides and a sub-population of nucleic acid from the first or subsequent rounds of nucleic acid shuffling in order to remove any silent mutations from the sub-population.

Any source of nucleic acid, in purified form, can be utilized as the starting nucleic acid. Thus the process may employ DNA or RNA including messenger RNA, which DNA or RNA may be single or double stranded. In addition, a DNA-RNA hybrid which contains one strand of each may be utilized. The nucleic acid sequence may be of various lengths depending on the size of the nucleic acid sequence to be mutated. Preferably the specific nucleic acid sequence is from 50 to 50000 base pairs. It is contemplated that entire vectors containing the nucleic acid encoding the protein of interest may be used in the methods of this invention.

The nucleic acid may be obtained from any source, for example, from plasmids such as pBR322, from cloned DNA or RNA or from natural DNA or RNA from any source including bacteria, yeast, viruses and higher organisms such as plants or animals. DNA or RNA may be extracted from blood or tissue material. The template polynucleotide may be obtained by amplification using the polymerase chain reaction (PCR) (U.S. Patent Nos. 4,683,202 and 4,683,195). Alternatively, the polynucleotide may be present in a vector present in a cell and sufficient nucleic acid may be obtained by culturing the cell and extracting the nucleic acid from the cell by methods known in the art.

Any specific nucleic acid sequence can be used to produce the population of mutants by the present process. It is only necessary that a small population of mutant sequences of the specific nucleic acid sequence exist or be created prior to the present process.

The initial small population of the specific nucleic acid sequences having mutations may be created by a number of different methods. Mutations may be created by error-prone PCR. Error-prone PCR uses low-fidelity polymerization conditions to introduce a low level of point mutations randomly over a long sequence. Alternatively, mutations can be introduced into the template polynucleotide by oligonucleotide-directed mutagenesis. In oligonucleotide-directed mutagenesis, a short sequence of the polynucleotide is removed from the polynucleotide using restriction enzyme digestion and is replaced with a synthetic polynucleotide in which various bases have been altered from the original sequence. The polynucleotide sequence can also be altered by chemical mutagenesis. Chemical mutagens include, for example, sodium bisulfite, nitrous acid, hydroxylamine, hydrazine or formic acid. Other agents which are analogues of nucleotide precursors include nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine. Generally, these agents are added to the PCR reaction in place of the nucleotide precursor thereby mutating the sequence. Intercalating agents such as proflavine, acriflavine, quinacrine and the like can also be used. Random mutagenesis of the polynucleotide sequence can also be achieved by irradiation with X-rays or ultraviolet light. Generally, plasmid polynucleotides so mutagenized are introduced into E. coli and propagated as a pool or library of mutant plasmids.

Alternatively the small mixed population of specific nucleic acids may be found in nature in that they may consist of different alleles of the same gene or the same gene from different related species (i.e., cognate genes). Alternatively, they may be related DNA sequences found within one species, for example, the immunoglobulin genes.

Once the mixed population of the specific nucleic acid sequences is
generated, the polynucleotides can be used directly or inserted into an appropriate
cloning vector, using techniques well-known in the art.

The choice of vector depends on the size of the polynucleotide sequence and the host cell to be employed in the methods of this invention. The templates of this invention may be plasmids, phages, cosmids, phagemids, viruses (e.g., retroviruses, parainfluenzavirus, herpesviruses, reoviruses, paramyxoviruses, and the like), or selected portions thereof (e.g., coat protein, spike glycoprotein, capsid protein). For example, cosmids and phagemids are preferred where the specific nucleic acid sequence to be mutated is larger because these vectors are able to stably propagate large polynucleotides.

If the mixed population of the specific nucleic acid sequence is cloned into a vector it can be clonally amplified by inserting each vector into a host cell and allowing the host cell to amplify the vector. This is referred to as clonal amplification because while the absolute number of nucleic acid sequences increases, the number of mutants does not increase. Utility can be readily determined by screening expressed polypeptides.

The DNA shuffling method of this invention can be performed blindly on a pool of unknown sequences. By adding to the reassembly mixture oligonucleotides (with ends that are homologous to the sequences being reassembled) any sequence mixture can be incorporated at any specific position into another sequence mixture. Thus, it is contemplated that mixtures of synthetic oligonucleotides, PCR polynucleotides or even whole genes can be mixed into another sequence library at defined positions. The insertion of one sequence (mixture) is independent from the insertion of a sequence in another part of the template. Thus, the degree of recombination, the homology required, and the diversity of the library can be independently and simultaneously varied along the length of the reassembled DNA.

This approach of mixing two genes may be useful for the humanization of antibodies from murine hybridomas. [...] The approach may also be useful in any nucleic acid [...]

[...] Scaffold-like protein structures may be particularly suitable for shuffling. The conserved scaffold determines the overall folding by self-association, while displaying relatively unrestricted loops that mediate the specific binding. Examples of such scaffolds are the immunoglobulin beta-barrel and the four-helix bundle, which are well-known in the art. This shuffling can be used to create scaffold-like proteins with various combinations of mutated sequences for binding.

In Vitro Shuffling

The equivalents of some standard genetic matings may also be performed by shuffling in vitro. For example, a "molecular backcross" can be performed by repeated mixing of the mutant's nucleic acid with the wild-type nucleic acid while selecting for the mutations of interest. As in traditional breeding, this approach can be used to combine phenotypes from different sources into a background of choice. It is useful, for example, for the removal of neutral mutations that affect unselected characteristics (e.g., immunogenicity). Thus it can be useful to determine which mutations in a protein are involved in the enhanced biological activity and which are not, an advantage which cannot be achieved by error-prone mutagenesis or cassette mutagenesis methods.
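
How quickly a molecular backcross dilutes out a neutral mutation can be estimated with a deliberately simple model. The sketch below rests on strong assumptions that are not stated in this document: the neutral mutation is unlinked to the selected mutation, recombination is free, and in each round a given site in a selected recombinant derives from the mutant pool with probability equal to the mutant fraction of the mixture. The function names are hypothetical.

```python
def neutral_retention(wt_to_mutant_ratio, rounds):
    """Expected fraction of selected recombinants that still carry a
    given unlinked neutral mutation after the stated number of backcross
    rounds, under the simplifying assumptions in the lead-in."""
    if wt_to_mutant_ratio < 0 or rounds < 0:
        raise ValueError("ratio and rounds must be non-negative")
    per_round = 1.0 / (1.0 + wt_to_mutant_ratio)  # mutant fraction of the mix
    return per_round ** rounds


def rounds_to_reach(wt_to_mutant_ratio, target, max_rounds=1000):
    """Smallest number of rounds after which expected retention is at or
    below the target fraction."""
    for n in range(max_rounds + 1):
        if neutral_retention(wt_to_mutant_ratio, n) <= target:
            return n
    raise ValueError("target not reached within max_rounds")
```

Under this model a 1:1 mixture halves the expected retention each round, while a large wild-type excess removes neutral mutations in far fewer rounds; real outcomes depend on linkage and crossover frequency, which the model ignores.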

Large, functional genes can be assembled correctly from a mixture of small random polynucleotides. This reaction may be of use for the reassembly of genes from the highly fragmented DNA of fossils. In addition random nucleic acid fragments from fossils may be combined with polynucleotides from similar genes from related species.

It is also contemplated that the method of this invention can be used for the in vitro amplification of a whole genome from a single cell as is needed for a variety of research and diagnostic applications. DNA amplification by PCR is in practice limited to a length of about 40 kb. Amplification of a whole genome such as that of E. coli (5,000 kb) by PCR would require about 250 primers yielding 125 forty kb polynucleotides. This approach is not practical due to the unavailability of sufficient sequence data. On the other hand, random production of polynucleotides of the genome with sexual PCR cycles, followed by gel purification of small polynucleotides will provide a multitude of possible primers. Use of this mix of random small polynucleotides as primers in a PCR reaction alone or with the whole genome as the template should result in an inverse chain reaction with the theoretical endpoint of a single concatemer containing many copies of the genome.
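
The primer count quoted for conventional PCR of a whole genome follows from simple tiling arithmetic. The sketch below assumes non-overlapping amplicons and one primer pair per amplicon; the function name is hypothetical.

```python
import math


def pcr_tiling(genome_kb, max_amplicon_kb=40):
    """Return (number of amplicons, number of primers) needed to tile a
    genome with non-overlapping amplicons of at most max_amplicon_kb,
    using two primers per amplicon."""
    amplicons = math.ceil(genome_kb / max_amplicon_kb)
    return amplicons, 2 * amplicons
```

For a 5,000 kb genome and a 40 kb amplicon limit this gives 125 amplicons and 250 primers, matching the figures in the text; any overlap between amplicons would only increase the count.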

100 fold amplification in the copy number and an average polynucleotide size of greater than 50 kb may be obtained when only random polynucleotides are used. It is thought that the larger concatemer is generated by overlap of many smaller polynucleotides. The quality of specific PCR products obtained using synthetic primers will be indistinguishable from the product obtained from unamplified DNA. It is expected that this approach will be useful for the mapping of genomes.

The polynucleotide to be shuffled can be produced as random or non-random polynucleotides, at the discretion of the practitioner.



In Vivo Shuffling

In an embodiment of in vivo shuffling, the mixed population of the specific nucleic acid sequence is introduced into bacterial or eukaryotic cells under conditions such that at least two different nucleic acid sequences are present in each host cell. The polynucleotides can be introduced into the host cells by a variety of different methods. The host cells can be transformed with the smaller polynucleotides using methods known in the art, for example treatment with calcium chloride. If the polynucleotides are inserted into a phage genome, the host cell can be transfected with the recombinant phage genome having the specific nucleic acid sequences. Alternatively, the nucleic acid sequences can be introduced into the host cell using electroporation, transfection, lipofection, biolistics, conjugation, and the like.

In general, in this embodiment, the specific nucleic acid sequences will be present in vectors which are capable of stably replicating the sequence in the host cell. In addition, it is contemplated that the vectors will encode a marker gene such that host cells having the vector can be selected. This ensures that the mutated specific nucleic acid sequence can be recovered after introduction into the host cell. However, it is contemplated that the entire mixed population of the specific nucleic acid sequences need not be present on a vector sequence. Rather only a sufficient number of sequences need be cloned into vectors to ensure that after introduction of the polynucleotides into the host cells each host cell contains one vector having at least one specific nucleic acid sequence present therein. It is also contemplated that rather than having a subset of the population of the specific nucleic acid sequences cloned into vectors, this subset may be already stably integrated into the host cell.

It has been found that when two polynucleotides which have regions of identity are inserted into the host cells homologous recombination occurs between the two polynucleotides. Such recombination between the two mutated specific nucleic acid sequences will result in the production of double or triple mutants in some situations.

It has also been found that the frequency of recombination is increased if some of the mutated specific nucleic acid sequences are present on linear nucleic acid molecules. Therefore, in a preferred embodiment, some of the specific nucleic acid sequences are present on linear polynucleotides.

After transformation, the host cell transformants are placed under selection to identify those host cell transformants which contain mutated specific nucleic acid sequences having the qualities desired. For example, if increased resistance to a particular drug is desired then the transformed host cells may be subjected to increased concentrations of the particular drug and those transformants producing mutated proteins able to confer increased drug resistance will be selected. If the enhanced ability of a particular protein to bind to a receptor is desired, then expression of the protein can be induced from the transformants and the resulting protein assayed in a ligand binding assay by methods known in the art to identify that subset of the mutated population which shows enhanced binding to the ligand. Alternatively, the protein can be expressed in another system to ensure proper processing.

Once a subset of the first recombined specific nucleic acid sequences (daughter sequences) having the desired characteristics is identified, they are then subject to a second round of recombination.

In the second cycle of recombination, the recombined specific nucleic acid sequences may be mixed with the original mutated specific nucleic acid sequences (parent sequences) and the cycle repeated as described above. In this way a set of second recombined specific nucleic acid sequences can be identified which have enhanced characteristics or encode proteins having enhanced properties. This cycle can be repeated a number of times as desired.

It is also contemplated that in the second or subsequent recombination cycle, a backcross can be performed. A molecular backcross can be performed by mixing the desired specific nucleic acid sequences with a large number of the wild-type sequence, such that at least one wild-type nucleic acid sequence and a mutated nucleic acid sequence are present in the same host cell after transformation. Recombination with the wild-type specific nucleic acid sequence will eliminate those neutral mutations that may affect unselected characteristics such as immunogenicity but not the selected characteristics.

In another embodiment of this invention, it is contemplated that during the first round a subset of the specific nucleic acid sequences can be generated as smaller polynucleotides by slowing or halting their PCR amplification prior to introduction into the host cell. The size of the polynucleotides must be large enough to contain some regions of identity with the other sequences so as to homologously recombine with the other sequences. The size of the polynucleotides will range from 0.03 kb to 100 kb, more preferably from 0.2 kb to 10 kb. It is also contemplated that in subsequent rounds, all of the specific nucleic acid sequences other than the sequences selected from the previous round may be utilized to generate PCR polynucleotides prior to introduction into the host cells.

The shorter polynucleotide sequences can be single-stranded or double-stranded. If the sequences were originally single-stranded and have become double-stranded they can be denatured with heat, chemicals or enzymes prior to insertion into the host cell. The reaction conditions suitable for separating the strands of nucleic acid are well known in the art.

The steps of this process can be repeated indefinitely, being limited only by the number of possible mutants which can be achieved. After a certain number of cycles, all possible mutants will have been achieved and further cycles are redundant.
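
The point at which further cycles become redundant can be illustrated with a toy enumeration. The sketch below is an idealization, not a description of the process itself: a genotype is modelled as the set of mutations it carries, every pair in the population is assumed to recombine in every cycle, and each differing site is assumed to be inherited independently from either parent. All names are hypothetical.

```python
from itertools import combinations


def recombine(g1, g2):
    """All genotypes obtainable from two parents when sites shared by
    both are kept and each differing site may come from either parent."""
    shared = g1 & g2
    differing = sorted(g1 ^ g2)
    children = set()
    for k in range(len(differing) + 1):
        for subset in combinations(differing, k):
            children.add(shared | frozenset(subset))
    return children


def cycles_until_saturation(population):
    """Repeat all-against-all recombination until no new genotype
    appears; return (number of productive cycles, final genotype set)."""
    current = set(population)
    cycles = 0
    while True:
        expanded = set(current)
        for g1, g2 in combinations(current, 2):
            expanded |= recombine(g1, g2)
        if expanded == current:
            return cycles, current
        current = expanded
        cycles += 1
```

Starting from three single mutants, this idealized model reaches all eight possible combinations (including the wild type and the triple mutant) after two cycles, and a third cycle adds nothing new.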

In an embodiment the same mutated template nucleic acid is repeatedly recombined and the resulting recombinants selected for the desired characteristic.

Therefore, the initial pool or population of mutated template nucleic acid is cloned into a vector capable of replicating in a bacterium such as E. coli. The particular vector is not essential, so long as it is capable of autonomous replication in E. coli. In a preferred embodiment, the vector is designed to allow the expression and production of any protein encoded by the mutated specific nucleic acid linked to the vector. It is also preferred that the vector contain a gene encoding a selectable marker.



The population of vectors containing the pool of mutated nucleic acid sequences is introduced into the E. coli host cells. The vector nucleic acid sequences may be introduced by transformation, transfection or infection in the case of phage. The concentration of vectors used to transform the bacteria is such that a number of vectors is introduced into each cell. Once present in the cell, the efficiency of homologous recombination is such that homologous recombination occurs between the various vectors. This results in the generation of mutants (daughters) having a combination of mutations which differ from the original parent mutated sequences.

The host cells are then clonally replicated and selected for the marker gene
present on the vector. Only those cells having a plasmid will grow under the selection.

The host cells which contain a vector are then tested for the presence of favorable mutations. Such testing may consist of placing the cells under selective pressure, for example, if the gene to be selected is an improved drug resistance gene. If the vector allows expression of the protein encoded by the mutated nucleic acid sequence, then such selection may include allowing expression of the protein so encoded, isolation of the protein and testing of the protein to determine whether, for example, it binds with increased efficiency to the ligand of interest.

Once a particular daughter mutated nucleic acid sequence has been identified which confers the desired characteristics, the nucleic acid is isolated either already linked to the vector or separated from the vector. This nucleic acid is then mixed with the first or parent population of nucleic acids and the cycle is repeated.

It has been shown that by this method nucleic acid sequences having enhanced desired properties can be selected.

In an alternate embodiment, the first generation of mutants are retained in the
cells and the parental mutated sequences are added again to the cells. Accordingly, the
first cycle of Embodiment I is conducted as described above. However, after the
daughter nucleic acid sequences are identified, the host cells containing these sequences
are retained.

The parent mutated specific nucleic acid population, either as polynucleotides
or cloned into the same vector, is introduced into the host cells already containing the
daughter nucleic acids. Recombination is allowed to occur in the cells and the next
generation of recombinants, or granddaughters, are selected by the methods described
above.

This cycle can be repeated a number of times until the nucleic acid or peptide
having the desired characteristics is obtained. It is contemplated that in subsequent
cycles, the population of mutated sequences which are added to the preferred mutants
may come from the parental mutants or any subsequent generation.

In an alternative embodiment the invention provides a method of conducting
a "molecular" backcross of the obtained recombinant specific nucleic acid in order to
eliminate any neutral mutations. Neutral mutations are those mutations which do not
confer onto the nucleic acid or peptide the desired properties. Such mutations may
however confer on the nucleic acid or peptide undesirable characteristics. Accordingly,
it is desirable to eliminate such neutral mutations. The method of this invention provides
a means of doing so.
In this embodiment, after the mutant nucleic acid having the desired
characteristics is obtained by the methods of the embodiments, the nucleic acid, the
vector having the nucleic acid, or the host cell containing the vector and nucleic acid is
isolated.

The nucleic acid or vector is then introduced into the host cell with a large
excess of the wild-type nucleic acid. The nucleic acid of the mutant and the nucleic acid
of the wild-type sequence are allowed to recombine. The resulting recombinants are
placed under the same selection as the mutant nucleic acid. Only those recombinants
which retained the desired characteristics will be selected. Any silent mutations which
do not provide the desired characteristics will be lost through recombination with the
wild-type DNA. This cycle can be repeated a number of times until all of the silent
mutations are eliminated.

Thus the methods of this invention can be used in a molecular backcross to
eliminate unnecessary or silent mutations.
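
The molecular backcross described above can be sketched, under stated assumptions,
as a small simulation. The sketch models a mutant as a set of mutated positions,
assumes one random crossover with wild type per cycle, and treats selection as a
filter that keeps only recombinants retaining one required mutation. The positions,
pool size, cycle count and function names are all hypothetical and chosen only for
illustration.

```python
import random


def backcross_once(mutant: frozenset, length: int, rng: random.Random) -> frozenset:
    """Recombine a mutant with wild type at one random crossover point.

    The recombinant inherits the mutant's positions on one side of the
    crossover and wild-type sequence (no mutations) on the other side.
    """
    point = rng.randrange(1, length)
    keep_left = rng.random() < 0.5
    return frozenset(p for p in mutant if (p < point) == keep_left)


def molecular_backcross(mutant, required, length, cycles, pool_size, rng):
    """Repeat recombination with wild type followed by selection."""
    pool = [frozenset(mutant)]
    for _ in range(cycles):
        offspring = [backcross_once(rng.choice(pool), length, rng)
                     for _ in range(pool_size)]
        # Selection: only recombinants retaining the required mutation survive.
        selected = [m for m in offspring if required <= m]
        pool = selected or pool  # keep the previous pool if nothing survives
    return pool


rng = random.Random(1)
start = {10, 40, 75, 120, 180}   # hypothetical mutated positions
required = frozenset({75})       # hypothetical mutation conferring the trait
final_pool = molecular_backcross(start, required, 200, 10, 200, rng)
smallest = min(final_pool, key=len)
```

In this toy model, mutations that are not required for the selected property are
progressively replaced by wild-type sequence, while the required mutation is
retained in every surviving recombinant.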



Utility

The in vivo recombination method of this invention can be performed blindly
on a pool of unknown mutants or alleles of a specific polynucleotide or sequence.
However, it is not necessary to know the actual DNA or RNA sequence of the specific
polynucleotide.

The approach of using recombination within a mixed population of genes can
be useful for the generation of any useful proteins, for example, interleukin I, antibodies,
tPA, growth hormone, etc. This approach may be used to generate proteins having
altered specificity or activity. The approach may also be useful for the generation of
mutant nucleic acid sequences, for example, promoter regions, introns, exons, enhancer
sequences, 3' untranslated regions or 5' untranslated regions of genes. Thus this
approach may be used to generate genes having increased rates of expression. This
approach may also be useful in the study of repetitive DNA sequences. Finally, this
approach may be useful to mutate ribozymes or aptamers.

Scaffold-like regions separating regions of diversity in proteins may be
particularly suitable for the methods of this invention. The conserved scaffold
determines the overall folding by self-association, while displaying relatively unrestricted
loops that mediate the specific binding. Examples of such scaffolds are the
immunoglobulin beta barrel and the four-helix bundle. The methods of this invention can be used
to create scaffold-like proteins with various combinations of mutated sequences for
binding.

The equivalents of some standard genetic matings may also be performed by
the methods of this invention. For example, a "molecular" backcross can be performed
by repeated mixing of the mutant's nucleic acid with the wild-type nucleic acid while
selecting for the mutations of interest. As in traditional breeding, this approach can be
used to combine phenotypes from different sources into a background of choice. It is
useful, for example, for the removal of neutral mutations that affect unselected
characteristics (i.e., immunogenicity). Thus it can be useful to determine which mutations
in a protein are involved in the enhanced biological activity and which are not.

Peptide Display Methods 
The present method can be used to shuffle, by in vitro and/or in vivo
recombination by any of the disclosed methods, and in any combination, polynucleotide
sequences selected by peptide display methods, wherein an associated polynucleotide
encodes a displayed peptide which is screened for a phenotype (e.g., for affinity for a
predetermined receptor (ligand)).

An increasingly important aspect of biopharmaceutical drug development
and molecular biology is the identification of peptide structures, including the primary
amino acid sequences, of peptides or peptidomimetics that interact with biological
macromolecules. One method of identifying peptides that possess a desired structure or
functional property, such as binding to a predetermined biological macromolecule (e.g.,
a receptor), involves the screening of a large library of peptides for individual library
members which possess the desired structure or functional property conferred by the
amino acid sequence of the peptide.

In addition to direct chemical synthesis methods for generating peptide
libraries, several recombinant DNA methods also have been reported. One type involves
the display of a peptide sequence, antibody, or other protein on the surface of a
bacteriophage particle or cell. Generally, in these methods each bacteriophage particle
or cell serves as an individual library member displaying a single species of displayed
peptide in addition to the natural bacteriophage or cell protein sequences. Each
bacteriophage or cell contains the nucleotide sequence information encoding the
particular displayed peptide sequence; thus, the displayed peptide sequence can be
ascertained by nucleotide sequence determination of an isolated library member.

A well-known peptide display method involves the presentation of a peptide
sequence on the surface of a filamentous bacteriophage, typically as a fusion with a
bacteriophage coat protein. The bacteriophage library can be incubated with an
immobilized, predetermined macromolecule or small molecule (e.g., a receptor) so that
bacteriophage particles which present a peptide sequence that binds to the immobilized
macromolecule can be differentially partitioned from those that do not present peptide
sequences that bind to the predetermined macromolecule. The bacteriophage particles
(i.e., library members) which are bound to the immobilized macromolecule are then
recovered and replicated to amplify the selected bacteriophage sub-population for a
subsequent round of affinity enrichment and phage replication. After several rounds of
affinity enrichment and phage replication, the bacteriophage library members that are
thus selected are isolated and the nucleotide sequence encoding the displayed peptide
sequence is determined, thereby identifying the sequence(s) of peptides that bind to the
predetermined macromolecule (e.g., receptor). Such methods are further described in
PCT patent publication Nos. 91/17271, 91/18980, 91/19818 and 93/08278.
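
The effect of repeated rounds of affinity enrichment and phage replication can be
illustrated with a short calculation. The sketch below assumes that, in each round,
a fixed fraction of binding phage and a much smaller fixed fraction of non-binding
phage are recovered, and that replication restores the population size without
changing its composition. The recovery values and the starting binder frequency
are hypothetical numbers chosen for illustration; they are not taken from the
disclosure or from the cited publications.

```python
def binder_fraction(initial_fraction: float,
                    binder_recovery: float,
                    background_recovery: float,
                    rounds: int) -> float:
    """Fraction of the phage population that are binders after each panning round."""
    binders = initial_fraction
    background = 1.0 - initial_fraction
    for _ in range(rounds):
        # Affinity partitioning: binders are recovered far more efficiently.
        binders *= binder_recovery
        background *= background_recovery
        # Replication amplifies the recovered phage; only the ratio matters.
        total = binders + background
        binders, background = binders / total, background / total
    return binders


# Hypothetical values: 1 binder per million phage, 10% binder recovery,
# 0.01% non-specific recovery (a 1000-fold enrichment per round).
after_one = binder_fraction(1e-6, 0.1, 1e-4, rounds=1)
after_three = binder_fraction(1e-6, 0.1, 1e-4, rounds=3)
```

Under these assumed values the binders rise from one in a million to roughly one in
a thousand after a single round and dominate the population after three rounds,
which is why several rounds are typically performed before sequencing.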
The latter PCT publication describes a recombinant DNA method for the
display of peptide ligands that involves the production of a library of fusion proteins with
each fusion protein composed of a first polypeptide portion, typically comprising a
variable sequence, that is available for potential binding to a predetermined
macromolecule, and a second polypeptide portion that binds to DNA, such as the DNA vector
encoding the individual fusion protein. When transformed host cells are cultured under
conditions that allow for expression of the fusion protein, the fusion protein binds to the
DNA vector encoding it. Upon lysis of the host cell, the fusion protein/vector DNA
complexes can be screened against a predetermined macromolecule in much the same
way as bacteriophage particles are screened in the phage-based display system, with the
replication and sequencing of the DNA vectors in the selected fusion protein/vector DNA
complexes serving as the basis for identification of the selected library peptide
sequence(s).

Other systems for generating libraries of peptides and like polymers have
aspects of both the recombinant and in vitro chemical synthesis methods. In these hybrid
methods, cell-free enzymatic machinery is employed to accomplish the in vitro synthesis
of the library members (i.e., peptides or polynucleotides). In one type of method, RNA
molecules with the ability to bind a predetermined protein or a predetermined dye
molecule were selected by alternate rounds of selection and PCR amplification (Tuerk
and Gold (1990) Science 249: 505; Ellington and Szostak (1990) Nature 346: 818). A
similar technique was used to identify DNA sequences which bind a predetermined
human transcription factor (Thiesen and Bach (1990) Nucleic Acids Res. 18: 3203;
Beaudry and Joyce (1992) Science 257: 635; PCT patent publication Nos. 92/05258 and
92/14843). In a similar fashion, the technique of in vitro translation has been used to
synthesize proteins of interest and has been proposed as a method for generating large
libraries of peptides. These methods which rely upon in vitro translation, generally
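comprising stabilized polysome complexes, are described further in PCT patent
publication Nos. 88/08453, 90/05785, 90/07003, 91/02076, 91/05058, and 92/02536.
Applicants have described methods in which library members comprise a fusion protein
having a first polypeptide portion with DNA binding activity and a second polypeptide
portion having the library member unique peptide sequence; such methods are suitable
for use in cell-free in vitro selection formats, among others.

The displayed peptide sequences can be of varying lengths, typically
from 3-5000 amino acids long or longer, frequently from 5-100 amino acids long, and often
from about 8-15 amino acids long. A library can comprise library members having varying
lengths of displayed peptide sequence, or may comprise library members having a fixed
length of displayed peptide sequence. Portions or all of the displayed peptide
sequence(s) can be random, pseudorandom, defined set kernel, fixed, or the like. The
present display methods include methods for in vitro and in vivo display of single-chain
antibodies, such as nascent scfv on polysomes or scfv displayed on phage, which enable
large-scale screening of scfv libraries having broad diversity of variable region sequences
and binding specificities.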

The present invention also provides random, pseudorandom, and defined
sequence framework peptide libraries and methods for generating and screening those
libraries to identify useful compounds (e.g., peptides, including single-chain antibodies)
that bind to receptor molecules or epitopes of interest or gene products that modify
peptides or RNA in a desired fashion. The random, pseudorandom, and defined sequence
framework peptides are produced from libraries of peptide library members that comprise
displayed peptides or displayed single-chain antibodies attached to a polynucleotide
template from which the displayed peptide was synthesized. The mode of attachment may
vary according to the specific embodiment of the invention selected, and can include
encapsulation in a phage particle or incorporation in a cell.

A method of affinity enrichment allows a very large library of peptides and
single-chain antibodies to be screened and the polynucleotide sequence encoding the
desired peptide(s) or single-chain antibodies to be selected. The polynucleotide can then
be isolated and shuffled to recombine combinatorially the amino acid sequence of the
selected peptide(s) (or predetermined portions thereof) or single-chain antibodies (or just
VH, VL or CDR portions thereof). Using these methods, one can identify a peptide or
single-chain antibody as having a desired binding affinity for a molecule and can exploit
the process of shuffling to converge rapidly to a desired high-affinity peptide or scfv.
The peptide or antibody can then be synthesized in bulk by conventional means for any
suitable use (e.g., as a therapeutic or diagnostic agent).

A significant advantage of the present invention is that no prior information
regarding an expected ligand structure is required to isolate peptide ligands or antibodies
of interest. The peptide identified can have biological activity, which is meant to include
at least specific binding affinity for a selected receptor molecule and, in some instances,
will further include the ability to block the binding of other compounds, to stimulate or
inhibit metabolic pathways, to act as a signal or messenger, to stimulate or inhibit cellular
activity, and the like.

The present invention also provides a method for shuffling a pool of
polynucleotide sequences selected by affinity screening a library of polysomes displaying
nascent peptides (including single-chain antibodies) for library members which bind to
a predetermined receptor (e.g., a mammalian proteinaceous receptor such as, for
example, a peptidergic hormone receptor, a cell surface receptor, an intracellular protein
which binds to other protein(s) to form intracellular protein complexes such as
heterodimers and the like) or epitope (e.g., an immobilized protein, glycoprotein,
oligosaccharide, and the like).

Polynucleotide sequences selected in a first selection round (typically by
affinity selection for binding to a receptor (e.g., a ligand)) by any of these methods are
pooled and the pool(s) is/are shuffled by in vitro and/or in vivo recombination to produce
a shuffled pool comprising a population of recombined selected polynucleotide
sequences. The recombined selected polynucleotide sequences are subjected to at least
one subsequent selection round. The polynucleotide sequences selected in the
subsequent selection round(s) can be used directly, sequenced, and/or subjected to one
or more additional rounds of shuffling and subsequent selection. Selected sequences can
also be back-crossed with polynucleotide sequences encoding neutral sequences (i.e.,
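having insubstantial functional effect on binding), such as for example by back-crossing
with a wild-type or naturally-occurring sequence substantially identical to a selected
sequence to produce native-like functional peptides, which may be less immunogenic.
Generally, during back-crossing subsequent selection is applied to retain the property of
binding to the predetermined receptor (ligand).

Prior to or concomitant with the shuffling of selected sequences, the
sequences can be mutagenized. In one embodiment, selected library members are cloned
in a prokaryotic vector (e.g., plasmid, phagemid, or bacteriophage) wherein a collection
of individual colonies (or plaques) representing discrete library members are produced.
Individual selected library members can then be manipulated (e.g., by site-directed
mutagenesis, cassette mutagenesis, chemical mutagenesis, PCR mutagenesis, and the
like) to generate a collection of library members representing a kernel of sequence
diversity based on the sequence of the selected library member. The sequence of an
individual selected library member or pool can be manipulated to incorporate random
mutation, pseudorandom mutation, defined kernel mutation (i.e., comprising variant and
invariant residue positions and/or comprising variant residue positions which can comprise
a residue selected from a defined subset of amino acid residues), codon-based
mutation, and the like, either segmentally or over the entire length of the individual
selected library member sequence. The mutagenized selected library members are then
shuffled by in vitro and/or in vivo recombinatorial shuffling as disclosed herein.

The invention also provides peptide libraries comprising a plurality of
individual library members of the invention, wherein (1) each individual library member
of said plurality comprises a sequence produced by shuffling of a pool of selected
sequences, and (2) each individual library member comprises a variable peptide segment
sequence or single-chain antibody segment sequence which is distinct from the variable
peptide segment sequences or single-chain antibody sequences of other individual library
members in said plurality (although some library members may be present in more than
one copy per library due to uneven amplification, stochastic probability, or the like).

The invention also provides a product-by-process, wherein selected
polynucleotide sequences having (or encoding a peptide having) a predetermined binding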
specificity are formed by the process of: (1) screening a displayed peptide or displayed
single-chain antibody library against a predetermined receptor (e.g., ligand) or epitope
(e.g., antigen macromolecule) and identifying and/or enriching library members which
bind to the predetermined receptor or epitope to produce a pool of selected library
members, (2) shuffling by recombination the selected library members (or amplified or
cloned copies thereof) which bind the predetermined epitope and have been thereby
isolated and/or enriched from the library to generate a shuffled library, and (3) screening
the shuffled library against the predetermined receptor (e.g., ligand) or epitope (e.g.,
antigen macromolecule) and identifying and/or enriching shuffled library members which
bind to the predetermined receptor or epitope to produce a pool of selected shuffled
library members.

Antibody Display and Screening Methods 
The present method can be used to shuffle, by in vitro and/or in vivo
recombination by any of the disclosed methods, and in any combination, polynucleotide
sequences selected by antibody display methods, wherein an associated polynucleotide
encodes a displayed antibody which is screened for a phenotype (e.g., for affinity for
binding a predetermined antigen (ligand)).

Various molecular genetic approaches have been devised to capture the vast
immunological repertoire represented by the extremely large number of distinct variable
regions which can be present in immunoglobulin chains. The naturally-occurring germ
line immunoglobulin heavy chain locus is composed of separate tandem arrays of
variable (V) segment genes located upstream of a tandem array of diversity (D) segment
genes, which are themselves located upstream of a tandem array of joining (J) region genes,
which are located upstream of the constant region genes. During B lymphocyte
development, V-D-J rearrangement occurs wherein a heavy chain variable region gene
(VH) is formed by rearrangement to form a fused D-J segment followed by rearrangement
with a V segment to form a V-D-J joined product gene which, if productively rearranged,
encodes a functional variable region (VH) of a heavy chain. Similarly, light chain loci
rearrange one of several V segments with one of several J segments to form a gene
encoding the variable region (VL) of a light chain.

The vast repertoire of variable regions possible in immunoglobulins derives
in part from the numerous combinatorial possibilities of joining V and J segments (and,
in the case of heavy chain loci, D segments) during rearrangement in B cell development.
Additional sequence diversity in the heavy chain variable regions arises from
non-uniform rearrangements of the D segments during V-D-J joining and from N region
addition. Further, antigen-selection of specific B cell clones selects for higher affinity
variants having non-germline mutations in one or both of the heavy and light chain
variable regions, a phenomenon referred to as "affinity maturation" or "affinity
sharpening". Typically, these "affinity sharpening" mutations cluster in specific areas
of the variable region, most commonly in the complementarity-determining regions
(CDRs).

In order to overcome many of the limitations in producing and identifying
high-affinity immunoglobulins through antigen-stimulated B cell development (i.e.,
immunization), various prokaryotic expression systems have been developed that can be
manipulated to produce combinatorial antibody libraries which may be screened for
high-affinity antibodies to specific antigens. Recent advances in the expression of
antibodies in Escherichia coli and bacteriophage systems (see "Alternative Peptide
Display Methods", infra) have raised the possibility that virtually any specificity can be
obtained by either cloning antibody genes from characterized hybridomas or by de novo
selection using antibody gene libraries (e.g., from Ig cDNA).

Combinatorial libraries of antibodies have been generated in bacteriophage
lambda expression systems which may be screened as bacteriophage plaques or as
colonies of lysogens (Huse et al. (1989) Science 246: 1275; Caton and Koprowski (1990)
Proc. Natl. Acad. Sci. (U.S.A.) 87: 6450; Mullinax et al. (1990) Proc. Natl. Acad. Sci.
(U.S.A.) 87: 8095; Persson et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 2432).
Various embodiments of bacteriophage antibody display libraries and lambda phage
expression libraries have been described (Kang et al. (1991) Proc. Natl. Acad. Sci.
(U.S.A.) 88: 4363; Clackson et al. (1991) Nature 352: 624; McCafferty et al. (1990)
Nature 348: 552; Burton et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88: 10134;
Hoogenboom et al. (1991) Nucleic Acids Res. 19: 4133; Chang et al. (1991) J. Immunol.
147: 3610; Breitling et al. (1991) Gene 104: 147; Marks et al. (1991) J. Mol. Biol.
222: 581; Barbas et al. (1992) Proc. Natl. Acad. Sci. (U.S.A.) 89: 4457; Hawkins and
Winter (1992) Eur. J. Immunol. 22: 867; Marks et al. (1992) Biotechnology 10: 779; Marks
et al. (1992) J. Biol. Chem. 267: 16007; Lowman et al. (1991) Biochemistry 30: 10832;
Lerner et al. (1992) Science 258: 1313, incorporated herein by reference). Typically, a
bacteriophage antibody display library is screened with a receptor (e.g., polypeptide,
carbohydrate, glycoprotein, nucleic acid) that is immobilized (e.g., by covalent linkage
to a chromatography resin, to enrich for reactive phage by affinity chromatography)
and/or labeled (e.g., to screen plaque or colony lifts).

One particularly advantageous approach has been the use of so-called
single-chain fragment variable (scfv) libraries (Marks et al. (1992) Biotechnology 10:
779; Winter G and Milstein C (1991) Nature 349: 293; Clackson et al. (1991) op.cit.;
Marks et al. (1991) J. Mol. Biol. 222: 581; Chaudhary et al. (1990) Proc. Natl. Acad.
Sci. (USA) 87: 1066; Chiswell et al. (1992) TIBTECH 10: 80; McCafferty et al. (1990)
op.cit.; and Huston et al. (1988) Proc. Natl. Acad. Sci. (USA) 85: 5879). Various
embodiments of scfv libraries displayed on bacteriophage coat proteins have been
described.

Beginning in 1988, single-chain analogues of Fv fragments and their fusion
proteins have been reliably generated by antibody engineering methods. The first step
generally involves obtaining the genes encoding VH and VL domains with desired
binding properties; these V genes may be isolated from a specific hybridoma cell line,
selected from a combinatorial V-gene library, or made by V gene synthesis. The
single-chain Fv is formed by connecting the component V genes with an oligonucleotide
that encodes an appropriately designed linker peptide, such as (Gly-Gly-Gly-Gly-Ser)3
or equivalent linker peptide(s). The linker bridges the C-terminus of the first V region
and N-terminus of the second, ordered as either VH-linker-VL or VL-linker-VH. In
principle, the scfv binding site can faithfully replicate both the affinity and specificity of
its parent antibody combining site.
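
The two domain orders described above can be illustrated at the amino acid level
with a brief Python sketch. Only the (Gly-Gly-Gly-Gly-Ser)3 linker is taken from the
text; the function name `assemble_scfv` and the short VH and VL fragments used in
the example are hypothetical placeholders, not sequences of any real antibody.

```python
# (Gly-Gly-Gly-Gly-Ser)3 linker in one-letter amino acid code.
LINKER = "GGGGS" * 3


def assemble_scfv(vh: str, vl: str, order: str = "VH-VL", linker: str = LINKER) -> str:
    """Join VH and VL domains through a flexible linker in the requested order."""
    if order == "VH-VL":
        return vh + linker + vl
    if order == "VL-VH":
        return vl + linker + vh
    raise ValueError("order must be 'VH-VL' or 'VL-VH'")


# Hypothetical placeholder fragments standing in for full V domains.
vh_fragment = "QVQLVQSG"
vl_fragment = "DIQMTQSP"

scfv_hl = assemble_scfv(vh_fragment, vl_fragment)            # VH-linker-VL
scfv_lh = assemble_scfv(vh_fragment, vl_fragment, "VL-VH")   # VL-linker-VH
```

In either order the linker joins the C-terminus of the first domain to the
N-terminus of the second, so the two constructs differ only in which domain leads.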
Thus, scfv fragments are comprised of VH and VL domains linked into a
single polypeptide chain by a flexible linker peptide. After the scfv genes are assembled,
they are cloned into a phagemid and expressed at the tip of the M13 phage (or similar
filamentous bacteriophage) as fusion proteins with the bacteriophage pIII (gene 3) coat
protein. Enriching for phage expressing an antibody of interest is accomplished by
panning the recombinant phage displaying a population of scfv for binding to a
predetermined epitope (e.g., target antigen, receptor).

The linked polynucleotide of a library member provides the basis for
replication of the library member after a screening or selection procedure, and also
provides the basis for the determination, by nucleotide sequencing, of the identity of the
displayed peptide sequence or VH and VL amino acid sequence. The displayed peptide(s)
or single-chain antibody (e.g., scfv) and/or its VH and VL domains or their CDRs can
be cloned and expressed in a suitable expression system. Often polynucleotides encoding
the isolated VH and VL domains will be ligated to polynucleotides encoding constant
regions (CH and CL) to form polynucleotides encoding complete antibodies (e.g.,
chimeric or fully-human), antibody fragments, and the like. Often polynucleotides
encoding the isolated CDRs will be grafted into polynucleotides encoding a suitable
variable region framework (and optionally constant regions) to form polynucleotides
encoding complete antibodies (e.g., humanized or fully-human), antibody fragments, and
the like. Antibodies can be used to isolate preparative quantities of the antigen by
immunoaffinity chromatography. Various other uses of such antibodies are to diagnose
and/or stage disease (e.g., neoplasia) and for therapeutic application to treat disease, such
as for example: neoplasia, autoimmune disease, AIDS, cardiovascular disease, infections,
and the like.

Various methods have been reported for increasing the combinatorial
diversity of a scfv library to broaden the repertoire of binding species (idiotype
spectrum). The use of PCR has permitted the variable regions to be rapidly cloned either
from a specific hybridoma source or as a gene library from non-immunized cells,
affording combinatorial diversity in the assortment of VH and VL cassettes which can
be combined. Furthermore, the VH and VL cassettes can themselves be diversified, such
as by random, pseudorandom, or directed mutagenesis. Typically, VH and VL cassettes
are diversified in or near the complementarity-determining regions (CDRs), often the
third CDR, CDR3. Enzymatic inverse PCR mutagenesis has been shown to be a simple
and reliable method for constructing relatively large libraries of scfv site-directed
mutants (Stemmer et al. (1993) Biotechniques 14: 256), as has error-prone PCR and
chemical mutagenesis (Deng et al. (1994) J. Biol. Chem. 269: 9533). Riechmann et al.
(1993) Biochemistry 32: 8848 showed semi-rational design of an antibody scfv fragment
using site-directed randomization by degenerate oligonucleotide PCR and subsequent
phage display of the resultant scfv mutants. Barbas et al. (1992) op.cit. attempted to
circumvent the problem of limited repertoire sizes resulting from using biased variable
region sequences by randomizing the sequence in a synthetic CDR region of a human
tetanus toxoid-binding Fab.

CDR randomization has the potential to create approximately 1 x 10^20 CDRs
for the heavy chain CDR3 alone, and a roughly similar number of variants of the heavy
chain CDR1 and CDR2, and light chain CDR1-3 variants. Taken individually or
together, the combinatorial possibilities of CDR randomization of heavy and/or light
chains require generating a prohibitive number of bacteriophage clones to produce a
clone library representing all possible combinations, the vast majority of which will be
non-binding. Generation of such large numbers of primary transformants is not feasible
with current transformation technology and bacteriophage display systems. For example,
Barbas et al. (1992) op.cit. only generated 5 x 10^7 transformants, which represents only
a tiny fraction of the potential diversity of a library of thoroughly randomized CDRs.
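
The scale of the shortfall can be made concrete with a short calculation using the
two figures given above (approximately 1 x 10^20 potential heavy chain CDR3 variants
and 5 x 10^7 transformants). The last line is an illustrative inference only: it
assumes that every randomized position can take any of the 20 standard amino acids
and asks how many such positions would give about 10^20 sequences. The variable
names are hypothetical.

```python
import math

potential_cdr3_variants = 10**20      # approximate figure stated in the text
reported_transformants = 5 * 10**7    # library size stated in the text

# Fraction of the potential CDR3 diversity sampled by the reported library.
coverage = reported_transformants / potential_cdr3_variants

# How many times larger the library would need to be to cover every variant once.
fold_shortfall = potential_cdr3_variants // reported_transformants

# Assumption: fully random positions with 20 possible residues each.
equivalent_random_positions = math.log(potential_cdr3_variants, 20)
```

On these figures the reported library samples about 5 x 10^-13 of the potential
heavy chain CDR3 diversity, a shortfall of roughly two trillion-fold, before the
other five CDRs are even considered.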

Despite these substantial limitations, bacteriophage display of scfv has
already yielded a variety of useful antibodies and antibody fusion proteins. A bispecific
single chain antibody has been shown to mediate efficient tumor cell lysis (Gruber et al.
(1994) J. Immunol. 152: 5368). Intracellular expression of an anti-Rev scfv has been
shown to inhibit HIV-1 virus replication in vitro (Duan et al. (1994) Proc. Natl. Acad.
Sci. (USA) 91: 5075), and intracellular expression of an anti-p21ras scfv has been shown
to inhibit meiotic maturation of Xenopus oocytes (Biocca et al. (1993) Biochem.
Biophys. Res. Commun. 197: 422). Recombinant scfv which can be used to diagnose
HIV infection have also been reported, demonstrating the diagnostic utility of scfv
(Lilley et al. (1994) J. Immunol. Meth. 171: 211). Fusion proteins wherein an scfv is
linked to a second polypeptide, such as a toxin or fibrinolytic activator protein, have also
been reported (Holvoet et al. (1992) Eur. J. Biochem. 210: 945; Nicholls et al. (1993) J.
Biol. Chem. 268: 5302).

If it were possible to generate scfv libraries having broader antibody diversity
and overcoming many of the limitations of conventional CDR mutagenesis and
randomization methods which can cover only a very tiny fraction of the potential
sequence combinations, the number and quality of scfv antibodies suitable for therapeutic
and diagnostic use could be vastly improved. To address this, the in vitro and in vivo
shuffling methods of the invention are used to recombine CDRs which have been
obtained (typically via PCR amplification or cloning) from nucleic acids obtained from
selected displayed antibodies. Such displayed antibodies can be displayed on cells, on
bacteriophage particles, on polysomes, or any suitable antibody display system wherein
the antibody is associated with its encoding nucleic acid(s). In a variation, the CDRs are
initially obtained from mRNA (or cDNA) from antibody-producing cells (e.g., plasma
cells/splenocytes from an immunized wild-type mouse, a human, or a transgenic mouse
capable of making a human antibody as in WO92/03918, WO93/12227, and
WO94/25585), including hybridomas derived therefrom.

Polynucleotide sequences selected in a first selection round (typically by
affinity selection for displayed antibody binding to an antigen (e.g., a ligand)) by any of
these methods are pooled and the pool(s) is/are shuffled by in vitro and/or in vivo
recombination, especially shuffling of CDRs (typically shuffling heavy chain CDRs with
other heavy chain CDRs and light chain CDRs with other light chain CDRs) to produce
a shuffled pool comprising a population of recombined selected polynucleotide
sequences. The recombined selected polynucleotide sequences are expressed in a
selection format as a displayed antibody and subjected to at least one subsequent
selection round. The polynucleotide sequences selected in the subsequent selection
round(s) can be used directly, sequenced, and/or subjected to one or more additional
rounds of shuffling and subsequent selection until an antibody of the desired binding
affinity is obtained. Selected sequences can also be back-crossed with polynucleotide
sequences encoding neutral antibody framework sequences (i.e., having insubstantial
functional effect on antigen binding), such as for example by back-crossing with a human
variable region framework to produce human-like sequence antibodies. Generally,
during back-crossing subsequent selection is applied to retain the property of binding to
the predetermined antigen.

Alternatively, or in combination with the noted variations, the valency of the
target epitope may be varied to control the average binding affinity of selected scFv
library members. The target epitope can be bound to a surface or substrate at varying
densities, such as by including a competitor epitope, by dilution, or by other method
known to those in the art. A high density (valency) of predetermined epitope can be used
to enrich for scFv library members which have relatively low affinity, whereas a low
density (valency) can preferentially enrich for higher affinity scFv library members.

For generating diverse variable segments, a collection of synthetic
oligonucleotides encoding random, pseudorandom, or a defined sequence kernel set of
peptide sequences can be inserted by ligation into a predetermined site (e.g., a CDR).
Similarly, the sequence diversity of one or more CDRs of the single-chain antibody
cassette(s) can be expanded by mutating the CDR(s) with site-directed mutagenesis,
CDR-replacement, and the like. The resultant DNA molecules can be propagated in a
host for cloning and amplification prior to shuffling, or can be used directly (i.e., may
avoid loss of diversity which may occur upon propagation in a host cell) and the selected
library members subsequently shuffled.

Displayed peptide/polynucleotide complexes (library members) which encode
a variable segment peptide sequence of interest or a single-chain antibody of interest are
selected from the library by an affinity enrichment technique. This is accomplished by
means of an immobilized macromolecule or epitope specific for the peptide sequence of
interest, such as a receptor, other macromolecule, or other epitope species. Repeating
the affinity selection procedure provides an enrichment of library members encoding the
desired sequences, which may then be isolated for pooling and shuffling, for sequencing,
and/or for further propagation and affinity enrichment.
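The effect of repeating the affinity selection can be illustrated with a toy calculation. The sketch below is only an assumption-laden model, not part of the disclosed method: each library member is given a hypothetical per-round capture probability on the immobilized target, and the library is renormalized after every round. The function name `enrich` and all numbers are illustrative.

```python
def enrich(frequencies, capture_probabilities, rounds):
    """Return library composition after repeated affinity selection rounds.

    frequencies: starting fraction of each library member (sums to 1).
    capture_probabilities: assumed chance that each member is retained
    on the immobilized target in one round (hypothetical values).
    """
    freqs = list(frequencies)
    for _ in range(rounds):
        # Members are retained in proportion to abundance times capture chance.
        captured = [f * p for f, p in zip(freqs, capture_probabilities)]
        total = sum(captured)
        # Renormalize so the recovered pool again sums to 1.
        freqs = [c / total for c in captured]
    return freqs


# Hypothetical library: one binder in a million, background retained at 0.1%.
library = [1e-6, 1 - 1e-6]
capture = [0.5, 0.001]
for n in range(4):
    print(n, enrich(library, capture, n)[0])
```

In this model the ratio of binder to background grows by the same factor in every round, which is why a few rounds can suffice before pooling and shuffling.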
The library members without the desired specificity are removed by washing.
The degree and stringency of washing required will be determined for each peptide
sequence or single-chain antibody of interest and the immobilized predetermined
macromolecule or epitope. A certain degree of control can be exerted over the binding
characteristics of the nascent peptide/DNA complexes recovered by adjusting the
conditions of the binding incubation and the subsequent washing. The temperature, pH,
ionic strength, divalent cations concentration, and the volume and duration of the
washing will select for nascent peptide/DNA complexes within particular ranges of
affinity for the immobilized macromolecule. Selection based on slow dissociation rate,
which is usually predictive of high affinity, is often the most practical route. This may
be done either by continued incubation in the presence of a saturating amount of free
predetermined macromolecule, or by increasing the volume, number, and length of the
washes. In each case, the rebinding of dissociated nascent peptide/DNA or peptide/RNA
complex is prevented, and with increasing time, nascent peptide/DNA or peptide/RNA
complexes of higher and higher affinity are recovered.
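Selection on slow dissociation rate can be made concrete with simple first-order kinetics. The sketch below assumes, as the passage states, that rebinding of dissociated complexes is prevented, so the fraction of a complex still bound after a wash of duration t is exp(-k_off * t). The function names and the rate constants are hypothetical and chosen only for illustration.

```python
import math


def fraction_retained(k_off_per_s, wash_seconds):
    """Fraction of complexes still bound after washing, with no rebinding."""
    return math.exp(-k_off_per_s * wash_seconds)


def enrichment_ratio(k_off_slow, k_off_fast, wash_seconds):
    """How strongly a wash favors the slowly dissociating complex."""
    return (fraction_retained(k_off_slow, wash_seconds)
            / fraction_retained(k_off_fast, wash_seconds))


# Hypothetical off-rates (per second) for a tight and a weak binder.
for minutes in (1, 5, 10):
    print(minutes, enrichment_ratio(1e-4, 1e-2, minutes * 60))
```

Under these assumptions the preference for the slowly dissociating complex grows exponentially with wash time, consistent with longer or more numerous washes recovering complexes of higher affinity.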

Additional modifications of the binding and washing procedures may be
applied to find peptides with special characteristics. The affinities of some peptides are
dependent on ionic strength or cation concentration. This is a useful characteristic for
peptides that will be used in affinity purification of various proteins when gentle
conditions for removing the protein from the peptides are required.

One variation involves the use of multiple binding targets (multiple epitope
species, multiple receptor species), such that a scFv library can be simultaneously
screened for a multiplicity of scFv which have different binding specificities. Given that
the size of a scFv library often limits the diversity of potential scFv sequences, it is
typically desirable to use scFv libraries of as large a size as possible. The time and
economic considerations of generating a number of very large polysome scFv-display
libraries can become prohibitive. To avoid this substantial problem, multiple
predetermined epitope species (receptor species) can be concomitantly screened in a single
library, or sequential screening against a number of epitope species can be used. In one
variation, multiple target epitope species, each encoded on a separate bead (or subset of
beads), can be mixed and incubated with a polysome-display scFv library under suitable
binding conditions. The collection of beads, comprising multiple epitope species, can
then be used to isolate, by affinity selection, scFv library members. Generally,
subsequent affinity screening rounds can include the same mixture of beads, subsets
thereof, or beads containing only one or two individual epitope species. This approach
affords efficient screening, and is compatible with laboratory automation, batch
processing, and high throughput screening methods.

A variety of techniques can be used in the present invention to diversify a
peptide library or single-chain antibody library, or to diversify, prior to or concomitant
with shuffling, around variable segment peptides found in early rounds of panning to
have sufficient binding activity to the predetermined macromolecule or epitope. In one
approach, the positive selected peptide/polynucleotide complexes (those identified in an
early round of affinity enrichment) are sequenced to determine the identity of the active
peptides. Oligonucleotides are then synthesized based on these active peptide sequences,
employing a low level of all bases incorporated at each step to produce slight variations
of the primary oligonucleotide sequences. This mixture of (slightly) degenerate
oligonucleotides is then cloned into the variable segment sequences at the appropriate
locations. This method produces systematic, controlled variations of the starting peptide
sequences, which can then be shuffled. It requires, however, that individual positive
nascent peptide/polynucleotide complexes be sequenced before mutagenesis, and thus
is useful for expanding the diversity of small numbers of recovered complexes and
selecting variants having higher binding affinity and/or higher binding specificity. In a
variation, positive selected peptide/polynucleotide complexes (especially the variable
region sequences) are amplified by mutagenic PCR, the amplification products are
shuffled in vitro and/or in vivo, and one or more additional rounds of screening are done
prior to sequencing. The same general approach can be employed with
single-chain antibodies in order to expand the diversity and enhance the binding
affinity/specificity, typically by diversifying CDRs or adjacent framework regions prior
to or concomitant with shuffling. If desired, shuffling reactions can be spiked with
mutagenic oligonucleotides capable of in vitro recombination with the selected library
members. Thus, mixtures of synthetic oligonucleotides and PCR produced
polynucleotides (synthesized by error-prone or high-fidelity methods) can be added to
the in vitro shuffling mix and be incorporated into resulting shuffled library members
(shufflants).
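The synthesis of (slightly) degenerate oligonucleotides described above can be sketched in a few lines. This is only an illustrative model under stated assumptions: at each position the parental base is replaced, with a small hypothetical probability `doping_rate`, by one of the other three bases. The names `doped_oligo` and `doped_pool`, the example sequence, and the rate are not taken from the specification.

```python
import random

BASES = "ACGT"


def doped_oligo(parent, doping_rate, rng):
    """Return one variant of `parent` with a low level of other bases."""
    variant = []
    for base in parent:
        if rng.random() < doping_rate:
            # Incorporate one of the three non-parental bases at this step.
            variant.append(rng.choice([b for b in BASES if b != base]))
        else:
            variant.append(base)
    return "".join(variant)


def doped_pool(parent, doping_rate, size, seed=0):
    """Return a pool of slightly degenerate oligonucleotides."""
    rng = random.Random(seed)
    return [doped_oligo(parent, doping_rate, rng) for _ in range(size)]


# Hypothetical 24-nt sequence encoding part of a selected peptide.
pool = doped_pool("GGTTTCACCTACGCTATGCATTGG", doping_rate=0.05, size=5)
for oligo in pool:
    print(oligo)
```

A low doping rate keeps most pool members within a few substitutions of the sequenced parent, which corresponds to the systematic, controlled variation the passage describes.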
The present invention of shuffling enables the generation of a vast library of
CDR-variant single-chain antibodies. One way to generate such antibodies is to insert
synthetic CDRs into the single-chain antibody and/or CDR randomization prior to or
concomitant with shuffling. The sequences of the synthetic CDR cassettes are selected
by referring to known sequence data of human CDR and are selected in the discretion of
the practitioner according to the following guidelines: synthetic CDRs will have at least
40 percent positional sequence identity to known CDR sequences, and preferably will
have at least 50 to 70 percent positional sequence identity to known CDR sequences.
For example, a collection of synthetic CDR sequences can be generated by synthesizing
a collection of oligonucleotide sequences on the basis of naturally-occurring human CDR
sequences listed in Kabat et al. (1991) op. cit.; the pool(s) of synthetic CDR sequences
are calculated to encode CDR peptide sequences having at least 40 percent sequence
identity to at least one known naturally-occurring human CDR sequence. Alternatively,
a collection of naturally-occurring CDR sequences may be compared to generate
consensus sequences so that amino acids used at a residue position frequently (i.e., in at
least 5 percent of known CDR sequences) are incorporated into the synthetic CDRs at
the corresponding position(s). Typically, several (e.g., 3 to about 50) known CDR
sequences are compared and observed natural sequence variations between the known
CDRs are tabulated, and a collection of oligonucleotides encoding CDR peptide
sequences encompassing all or most permutations of the observed natural sequence
variations is synthesized. For example but not for limitation, if a collection of human VH
CDR sequences have carboxy-terminal amino acids which are either Tyr, Val, Phe, or
Asp, then the pool(s) of synthetic CDR oligonucleotide sequences are designed to allow
the carboxy-terminal CDR residue to be any of these amino acids. In some embodiments,
residues other than those which naturally occur at a residue position in the
collection of CDR sequences are incorporated: conservative amino acid substitutions are
frequently incorporated and up to 5 residue positions may be varied to incorporate
non-conservative amino acid substitutions as compared to known naturally-occurring
CDR sequences. Such CDR sequences can be used in primary library members (prior
to first round screening) and/or can be used to spike in vitro shuffling reactions of
selected library member sequences. Construction of such pools of defined and/or
degenerate sequences will be readily accomplished by those of ordinary skill in the art.
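Tabulating the variation observed at each residue position and then enumerating the permutations can be sketched as follows. The example is a minimal illustration that assumes the known CDR peptide sequences are already aligned to equal length; the four-residue sequences are hypothetical placeholders rather than real CDRs, and the function names are not from the specification.

```python
from itertools import product


def tabulate_variations(cdr_sequences):
    """List the residues observed at each aligned position."""
    if len({len(s) for s in cdr_sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [sorted(set(column)) for column in zip(*cdr_sequences)]


def enumerate_synthetic_cdrs(cdr_sequences):
    """Return every permutation of the observed per-position residues."""
    columns = tabulate_variations(cdr_sequences)
    return ["".join(residues) for residues in product(*columns)]


# Hypothetical aligned CDR fragments (one-letter amino acid codes).
known = ["GFTY", "GYTV", "GFSF"]
print(tabulate_variations(known))
print(len(enumerate_synthetic_cdrs(known)))
```

The number of permutations is the product of the number of residues seen at each position, so the pool size grows quickly as more known sequences are compared.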

The collection of synthetic CDR sequences comprises at least one member
that is not known to be a naturally-occurring CDR sequence. It is within the discretion
of the practitioner to include or not include a portion of random or pseudorandom
sequence corresponding to N region addition in the heavy chain CDR; the N region
sequence ranges from 1 nucleotide to about 4 nucleotides occurring at V-D and D-J
junctions. A collection of synthetic heavy chain CDR sequences comprises at least about
100 unique CDR sequences, typically at least about 1,000 unique CDR sequences,
preferably at least about 10,000 unique CDR sequences, frequently more than 50,000
unique CDR sequences; however, usually not more than about 1 x 10^6 unique CDR
sequences are included in the collection, although occasionally 1 x 10^7 to 1 x 10^8 unique
CDR sequences are present, especially if conservative amino acid substitutions are
permitted at positions where the conservative amino acid substituent is not present or is
rare (i.e., less than 0.1 percent) in that position in naturally-occurring human CDRs. In
general, the number of unique CDR sequences included in a library should not exceed
the expected number of primary transformants in the library by more than a factor of 10.
Such single-chain antibodies generally bind with an affinity of about at least 1 x 10^7 M^-1,
preferably with an affinity of about at least 5 x 10^7 M^-1, more preferably with an affinity
of at least 1 x 10^8 M^-1 to 1 x 10^9 M^-1 or more, sometimes
up to 1 x 10^10 M^-1 or more. Frequently, the predetermined antigen is a
human protein, such as for example a human cell surface antigen (e.g., CD4, CD8, IL-2
receptor, EGF receptor, PDGF receptor), other human biological macromolecule (e.g.,
thrombomodulin, protein C, carbohydrate antigen, sialyl Lewis antigen, L-selectin), or
nonhuman disease associated macromolecule (e.g., bacterial LPS, virion capsid protein
or envelope glycoprotein) and the like.
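Two pieces of arithmetic in this passage can be checked with a short sketch: the rule that the number of unique CDR sequences should not exceed the expected number of primary transformants by more than a factor of 10, and the relation between an association constant in M^-1 and the corresponding dissociation constant (Kd = 1/Ka). The function names and the example library sizes are illustrative assumptions.

```python
def within_transformant_limit(unique_cdr_sequences,
                              expected_primary_transformants,
                              max_factor=10):
    """True if the design stays within the factor-of-10 guideline."""
    return unique_cdr_sequences <= max_factor * expected_primary_transformants


def ka_to_kd_nanomolar(ka_per_molar):
    """Convert an association constant (M^-1) to a Kd in nanomolar."""
    return 1e9 / ka_per_molar


# Hypothetical library: 1e5 expected primary transformants.
print(within_transformant_limit(10**6, 10**5))
print(within_transformant_limit(10**7, 10**5))
for ka in (1e7, 5e7, 1e8, 1e9, 1e10):
    print(ka, ka_to_kd_nanomolar(ka))
```

On this conversion the affinity range quoted in the passage spans dissociation constants from roughly 100 nM down to 0.1 nM.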



WO 98/01581

PCT/US97/12239

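in plants (Firek et al. (1993) Plant Mol. Biol. 23: 861) and can be readily made in
prokaryotic systems (Owens RJ and Young RJ (1994) J. Immunol. Meth. 168: 149;
Johnson S and Bird RE (1991) Methods Enzymol. 203: 88). Furthermore, the
single-chain antibodies can be used as a basis for constructing whole antibodies or various
fragments thereof (Kettleborough et al. (1994) Eur. J. Immunol. 24: 952). The variable
region encoding sequence may be isolated (e.g., by PCR amplification or subcloning) and
spliced to a sequence encoding a desired human constant region to encode a human
sequence antibody more suitable for human therapeutic uses where immunogenicity is
preferably minimized. The polynucleotide(s) having the resultant fully human encoding
sequence(s) can be expressed in a host cell (e.g., from an expression vector in a
mammalian cell) and purified for pharmaceutical formulation.

The DNA expression constructs will typically include an expression control
DNA sequence operably linked to the coding sequences, including naturally-associated
or heterologous promoter regions. Preferably, the expression control sequences will be
eukaryotic promoter systems in vectors capable of transforming or transfecting
eukaryotic host cells. Once the vector has been incorporated into the appropriate host,
the host is maintained under conditions suitable for high level expression of the
nucleotide sequences, and the collection and purification of the mutant "engineered"
antibodies.

As stated previously, the DNA sequences will be expressed in hosts after the
sequences have been operably linked to an expression control sequence (i.e., positioned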
In addition to eukaryotic microorganisms such as yeast, mammalian tissue
cell culture may also be used to produce the polypeptides of the present invention (see
Winnacker, "From Genes to Clones," VCH Publishers, N.Y., N.Y. (1987), which is
incorporated herein by reference). Eukaryotic cells are actually preferred, because a
number of suitable host cell lines capable of secreting intact immunoglobulins have been
developed in the art, and include the CHO cell lines, various COS cell lines, HeLa cells,
myeloma cell lines, etc., but preferably transformed B-cells or hybridomas. Expression
vectors for these cells can include expression control sequences, such as an origin of
replication, a promoter, an enhancer (Queen et al. (1986) Immunol. Rev. 89: 49), and
necessary processing information sites, such as ribosome binding sites, RNA splice sites,
polyadenylation sites, and transcriptional terminator sequences. Preferred expression
control sequences are promoters derived from immunoglobulin genes, cytomegalovirus,
SV40, Adenovirus, Bovine Papilloma Virus, and the like.

Eukaryotic DNA transcription can be increased by inserting an enhancer
sequence into the vector. Enhancers are cis-acting sequences of between 10 to 300 bp
that increase transcription by a promoter. Enhancers can effectively increase
transcription when either 5' or 3' to the transcription unit. They are also effective if located
within an intron or within the coding sequence itself. Typically, viral enhancers are used,
including SV40 enhancers, cytomegalovirus enhancers, polyoma enhancers, and
adenovirus enhancers. Enhancer sequences from mammalian systems are also commonly
used, such as the mouse immunoglobulin heavy chain enhancer.

Mammalian expression vector systems will also typically include a selectable
marker gene. Examples of suitable markers include the dihydrofolate reductase gene
(DHFR), the thymidine kinase gene (TK), or prokaryotic genes conferring drug
resistance. The first two marker genes prefer the use of mutant cell lines that lack the
ability to grow without the addition of thymidine to the growth medium. Transformed
cells can then be identified by their ability to grow on non-supplemented media.
Examples of prokaryotic drug resistance genes useful as markers include genes
conferring resistance to G418, mycophenolic acid and hygromycin.
The vectors containing the DNA segments of interest can be transferred into
the host cell by well-known methods, depending on the type of cellular host. For
example, calcium chloride transfection is commonly utilized for prokaryotic cells,
whereas calcium phosphate treatment, lipofection, or electroporation may be used for
other cellular hosts. Other methods used to transform mammalian cells include the use
of Polybrene, protoplast fusion, liposomes, electroporation, and microinjection (see,
generally, Sambrook et al., supra).

Once expressed, the antibodies, individual mutated immunoglobulin chains,
mutated antibody fragments, and other immunoglobulin polypeptides of the invention
can be purified according to standard procedures of the art, including ammonium sulfate
precipitation, fraction column chromatography, gel electrophoresis and the like (see,
generally, Scopes, R., Protein Purification, Springer-Verlag, N.Y. (1982)). Once purified,
partially or to homogeneity as desired, the polypeptides may then be used therapeutically
or in developing and performing assay procedures, immunofluorescent stainings, and the
like (see, generally, Immunological Methods, Vols. I and II, Eds. Lefkovits and Pernis,
Academic Press, New York, N.Y. (1979 and 1981)).

The antibodies generated by the method of the present invention can be used
for diagnosis and therapy. By way of illustration and not limitation, they can be used to
treat cancer, autoimmune diseases, or viral infections. For treatment of cancer, the
antibodies will typically bind to an antigen expressed preferentially on cancer cells, such
as erbB-2, CEA, CD33, and many other antigens and binding members well known to
those skilled in the art.

Shuffling can also be used to recombinatorially diversify a pool of selected
library members obtained by screening a two-hybrid screening system to identify library
members which bind a predetermined polypeptide sequence. The selected library
members are pooled and shuffled by in vitro and/or in vivo recombination. The shuffled
pool can then be screened in a yeast two hybrid system to select library members which
bind said predetermined polypeptide sequence (e.g., an SH2 domain) or which bind an
alternate predetermined polypeptide sequence (e.g., an SH2 domain from another protein
species).

An approach to identifying polypeptide sequences which bind to a
predetermined polypeptide sequence has been to use a so-called "two-hybrid" system
wherein the predetermined polypeptide sequence is present in a fusion protein (Chien et
al. (1991) Proc. Natl. Acad. Sci. (USA) 88: 9578). This approach identifies
protein-protein interactions in vivo through reconstitution of a transcriptional activator
(Fields S and Song O (1989) Nature 340: 245), the yeast Gal4 transcription protein.
Typically, the method is based on the properties of the yeast Gal4 protein, which consists
of separable domains responsible for DNA-binding and transcriptional activation.
Polynucleotides encoding two hybrid proteins, one consisting of the yeast Gal4
DNA-binding domain fused to a polypeptide sequence of a known protein and the other
consisting of the Gal4 activation domain fused to a polypeptide sequence of a second
protein, are constructed and introduced into a yeast host cell. Intermolecular binding
between the two fusion proteins reconstitutes the Gal4 DNA-binding domain with the
Gal4 activation domain, which leads to the transcriptional activation of a reporter gene
(e.g., lacZ, HIS3) which is operably linked to a Gal4 binding site. Typically, the
two-hybrid method is used to identify novel polypeptide sequences which interact with
a known protein (Silver SC and Hunt SW (1993) Mol. Biol. Rep. 17: 155; Durfee et al.
(1993) Genes Devel. 7: 555; Yang et al. (1992) Science 257: 680; Luban et al. (1993)
Cell 73: 1067; Hardy et al. (1992) Genes Devel. 6: 801; Bartel et al. (1993) Biotechniques
14: 920; and Vojtek et al. (1993) Cell 74: 205). However, variations of the two-hybrid
method have been used to identify mutations of a known protein that affect its binding
to a second known protein (Li B and Fields S (1993) FASEB J. 7: 957; Lalo et al.
(1993) Proc. Natl. Acad. Sci. (USA) 90: 5524; Jackson et al. (1993) Mol. Cell. Biol.
13: 2899; and Madura et al. (1993) J. Biol. Chem. 268: 12046). Two-hybrid systems
have also been used to identify interacting structural domains of two known proteins
(Bardwell et al. (1993) Mol. Microbiol. 8: 1177; Chakraborty et al. (1992) J. Biol.
Chem. 267: 17493; Staudinger et al. (1993) J. Biol. Chem. 268: 4608; and Milne GT.
Example 1

Generation of Random Size Polynucleotides Using U.V. Induced Photoproducts

One microgram samples of template DNA are obtained and treated with U.V.
light to cause the formation of dimers, including TT dimers, particularly pyrimidine dimers.
U.V. exposure is limited so that only a few photoproducts are generated per gene on the
template DNA sample. Multiple samples are treated with U.V. light for varying periods
of time to obtain template DNA samples with varying numbers of dimers from U.V.
exposure.

A random priming kit which utilizes a non-proofreading polymerase (for
example, Prime-It II Random Primer Labeling kit by Stratagene Cloning Systems) is
utilized to generate different size polynucleotides by priming at random sites on
templates which are prepared by U.V. light (as described above) and extending along the
templates. The priming protocols such as described in the Prime-It II Random Primer
Labeling kit may be utilized to extend the primers. The dimers formed by U.V. exposure
serve as a roadblock for the extension by the non-proofreading polymerase. Thus, a pool
of random size polynucleotides is present after extension with the random primers is
finished.
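Why random priming on a template carrying a few dimers yields random size polynucleotides can be illustrated with a small simulation. This is a simplified sketch under explicit assumptions, not a model of the actual kit chemistry: dimer positions and primer start sites are drawn uniformly at random along a single template strand, and each extension stops at the first dimer downstream of its primer or at the end of the template. All names and parameter values are hypothetical.

```python
import random


def simulate_blocked_extension(template_length, n_dimers, n_primers, seed=0):
    """Return dimer positions and (start, stop) extension products."""
    rng = random.Random(seed)
    # Assumed: a few U.V. photoproducts at random interior positions.
    dimers = sorted(rng.sample(range(1, template_length), n_dimers))
    products = []
    for _ in range(n_primers):
        start = rng.randrange(template_length)  # random priming site
        # The polymerase extends until the first roadblock or the template end.
        stop = next((d for d in dimers if d > start), template_length)
        products.append((start, stop))
    return dimers, products


dimers, products = simulate_blocked_extension(1000, n_dimers=4, n_primers=10)
print("dimer positions:", dimers)
print("product lengths:", [stop - start for start, stop in products])
```

Varying `n_dimers` in this sketch plays the role of varying the U.V. exposure time: more roadblocks shift the product lengths toward shorter polynucleotides.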

Example 2

Isolation of Random Size Polynucleotides

Polynucleotides of interest which are generated according to Example 1 are
gel isolated on a 1.5% agarose gel. Polynucleotides in the 100-300 bp range are cut
out of the gel and 3 volumes of 6 M NaI is added to the gel slice. The mixture is
incubated at 50 °C for 10 minutes and 10 µl of glass milk (Bio 101) is added. The
mixture is spun for 1 minute and the supernatant is decanted. The pellet is washed with
500 µl of Column Wash (Column Wash is 50% ethanol, 10 mM Tris-HCl pH 7.5, 100
mM NaCl and 2.5 mM EDTA) and spun for 1 minute, after which the supernatant is
decanted. The washing, spinning and decanting steps are then repeated. The glass milk
pellet is resuspended in 20 µl of H2O and spun for 1 minute. DNA remains in the aqueous
phase.
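In computational terms, cutting the 100-300 bp range out of the gel is a size filter on the pool of extension products. The short sketch below shows that filter only; the function name and the example lengths are hypothetical, and the bounds default to the range stated in this Example.

```python
def gel_size_select(lengths_bp, min_bp=100, max_bp=300):
    """Keep only polynucleotide lengths inside the excised gel window."""
    return [n for n in lengths_bp if min_bp <= n <= max_bp]


# Hypothetical product lengths (bp) from a random priming reaction.
pool = [42, 100, 157, 230, 300, 301, 812]
print(gel_size_select(pool))
```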
the above. Such variations are expected to be within the ordinary skill of the average
practitioner and are encompassed by the present invention.
WHAT IS CLAIMED IS:

1. A method for producing mutant polynucleotides comprising:
producing polynucleotides by blocking or interrupting a polynucleotide
synthesis or amplification process with a member selected from the group consisting of
UV light, one or more DNA adducts, DNA intercalating agents, DNA binding proteins, [...]
polymerase inhibitors or poisons, said member being capable of blocking or interrupting
synthesis or amplification of a polynucleotide to provide a plurality of polynucleotides
due to said polynucleotides being in various stages of synthesis or amplification, and
subjecting said polynucleotides to an amplification procedure to amplify one
or more of the polynucleotides.

2. A process for producing mutant polynucleotides by a series of steps
comprising:
(a) producing oligonucleotides by blocking or interrupting a polynucleotide [...]
consisting of UV light, one or more DNA adducts, DNA intercalating agents, chain
terminators, and/or polymerase inhibitors/poisons [...]
blocking or interrupting said synthesis or amplification and provide a plurality
of polynucleotides due to their being in various stages of synthesis or amplification;
(b) denaturing the resulting single or double stranded oligonucleotides to
produce a mixture of single-stranded polynucleotides, optionally separating the
polynucleotides into pools of polynucleotides having various lengths, and further
optionally subjecting said polynucleotides to [...]
amplify one or more oligonucleotides comprised by at least one of the polynucleotide
pools;
(c) incubating a plurality of said polynucleotides or at least one pool of
said polynucleotides with a polymerase under conditions which result in annealing of
said single-stranded polynucleotides at regions of identity between the single-stranded
polynucleotides and formation of mutagenized double stranded polynucleotide chain;
(d) repeating steps (c) and (d);
(e) expressing at least one mutant polypeptide from said polynucleotide chain,
or chains; and
(f) screening said at least one mutant polypeptide for a useful activity.

3. A process according to claim 2, wherein said adduct is a member selected from
the group consisting of: UV light; (+)-CC-1065; (+)-CC-1065-(N3-Adenine); a
N-acetylated or deacetylated 4'-fluoro-4-aminobiphenyl adduct capable of inhibiting DNA
synthesis, or a N-acetylated or deacetylated 4-aminobiphenyl adduct capable of
inhibiting DNA synthesis; trivalent chromium; a trivalent chromium salt; a polycyclic
aromatic hydrocarbon ("PAH") DNA adduct capable of inhibiting DNA replication;
7-bromomethyl-benz[a]anthracene ("BMA"); tris(2,3-dibromopropyl)phosphate
("Tris-BP"); 1,2-dibromo-3-chloropropane ("DBCP"); 2-bromoacrolein (2BA);
benzo[a]pyrene-7,8-dihydrodiol-9-10-epoxide ("BPDE"); a platinum(II) halogen salt;
N-hydroxy-2-amino-3-methylimidazo[4,5-f]-quinoline; N-hydroxy-2-amino-1-methyl-6-
phenylimidazo[4,5-f]-pyridine, DNA intercalating agents, DNA binding proteins, triple
helix forming agents, competing transcription polymerases, chain terminators, and
polymerase inhibitors or poisons.



4. A process according to claim 2, wherein said DNA adduct is a member
selected from the group consisting of UV light, (+)-CC-1065 and (+)-CC-1065-(N3-
Adenine).

5. A process according to claim 4, further comprising heating said
polynucleotides and removing the DNA adduct or adducts from said polynucleotide or
polynucleotide pools.
6. A method for expressing a polypeptide comprising producing a polynucleotide
according to claim [...] and comprising the further steps of cloning said polynucleotide
into a vector or an expression vehicle and [...]

7. A vector or an expression vehicle including a polynucleotide produced
according to claim 2.

8. A polypeptide comprising [...]
polynucleotide produced by the method according to claim 2.